SUMMARY

This  research  addresses  the  problem  of  determining  the  existence 
of  a  representative  group/crew  learning  curve  (or  set  of  curves)  and  the 
development  of  a  mathematical  description  of  this  curve  applicable  to 
training levels in operational testing. Emphasis is placed on the
analysis  of  data  from  actual  operational  test  reports. 

An iterative procedure is developed to analyze sample data using
regression  techniques  to  screen  data  for  suitability  and  to  fit  nonlinear 
learning  models. 

A representative learning curve for the data analyzed is selected
by comparing the sum of squares regression and the lack of fit ratio
for each model.

This  comparison  shows  that  the  following  models  appeared  to 
provide  an  adequate  fit  to  the  data  analyzed. 

(1) $Y = at^{-b}$

(2) $Y = a[\theta + (1 - \theta)t^{-b}]$

(3) $Y = at^{-b} + C$

(4) $Y = ae^{bt}$

Since the variations of the power function, models (2) and (3), did not
appear to provide a better fit to the data, model (1) was preferred
from the standpoint of parsimony. It cannot be stated conclusively
that model (1) provides a statistically better fit to the data than
model (4). However, based on a survey of industrial applications of
the power function model as reported in the literature, it was concluded
that the model $Y = at^{-b}$ does adequately fit the empirical data analyzed
and  can  be  used  as  a  representative  group/crew  learning  model  for  this 


CHAPTER  I 


INTRODUCTION 


Background 

The  initial  direction  for  this  study  was  provided  in  a  research 
task  statement  by  the  U.S.  Army  Operational  Test  and  Evaluation  Agency 
(OTEA). 


Conduct  background  research,  including  literature  search 
covering  both  government  publications  and  the  general 
literature  and  field  visits  as  appropriate  to  identify  a 
general  case  learning  curve  (or  set  of  curves,  if  necessary) 
existing  in  current  test  data;  to  describe  this  curve  (or 
curves)  mathematically  in  a  manner  such  that  the  slope 
(first  derivative)  can  be  derived;  to  present  evidence  in 
support  of  the  validity  of  such  curves;  and,  to  prepare  a  set 
of  instructions  explaining  how  to  design  a  test  to  generate 
the  needed  data  and  then  treat  the  data  to  record  the  curves. 

OTEA  is  continually  required  to  assess  the  impact  of  the  training 
level  of  a  crew  or  unit  engaged  in  operational  tests.  This  assessment 
is  of  particular  importance  because  OTEA  has  the  mission  of  assisting 
in  the  planning,  directing,  and  evaluation  of  operational  testing  required 
during the materiel requisition process of all major systems and selected
nonmajor systems. Adequate and thorough operational testing is
essential  in  determining  an  item  or  system's  operational  suitability  and 
logistic  support  requirements  (1,2). 

Operational Testing (OT) is conducted in the most realistic test
environment  possible  and  utilizes  the  most  representative  configuration 
of  the  future  operational  system.  Because  operational  testing  is 
conducted throughout the development life cycle of materiel, it is
usually begun using early prototypes and continues through the cycle
by using production models.

To  enhance  the  validity  of  generated  test  data,  operational 
testing must be conducted by troop units, support personnel, and
individuals who will actually be issued the materiel for use.

Through  these  tests  a  comparison  is  made  between  new  materiel  and 
existing  equipment  being  operated  under  the  same  or  similar  mission 
profile. This testing concept helps decision makers to
accurately assess total operational suitability from a doctrinal,
organizational, and tactical viewpoint, and to collect performance and
reliability, availability, and maintainability data that closely simulate
what would be experienced after the materiel is issued to the field.
Results  of  testing  are  forwarded  through  channels  to  the  Army  Systems 
Acquisition  Review  Council  (ASARC),  with  final  decision  of  acceptance 
or  rejection  resting  with  the  Secretary  of  Defense  (3,4,5). 

Essentially, the assessment of crew or unit training levels has
traditionally been limited to qualitative techniques such as
administering a proposed training program (with the assumption that the
completed training equals a given training level), relying on ARMY
TRAINING AND EVALUATION PROGRAM (ARTEP) results, or using military
judgment.

Training data is currently overwhelmingly qualitative, whereas
quantitative data is much to be preferred in operational test and
evaluation.

It is generally agreed that a performance curve describing the
progress of training is an asymptotic "learning curve". Assuming this,
it should be possible to use the slope of a curve as a measure of how
closely a unit has approached the asymptote. The slope of a curve may
be expressed mathematically and can be treated rigorously. However,
even  though  it  is  generally  accepted  that  the  individual  "learning  curve" 
follows  this  assumption  and  appears  to  be  robust,  it  cannot  be  assumed 
that  a  representative  "learning  curve"  for  a  crew  or  unit  has  these 
same  properties. 
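
As an illustration of the slope argument, suppose (as an assumption made here for concreteness, not a result established at this point in the study) that performance follows a power function $Y = at^{-b}$ with $a > 0$ and $b > 0$, where $t$ is the trial number. The slope and its limiting behavior would then be:

```latex
\[
\frac{dY}{dt} = -ab\,t^{-(b+1)}, \qquad
\frac{1}{Y}\frac{dY}{dt} = -\frac{b}{t}, \qquad
\lim_{t \to \infty} \frac{dY}{dt} = 0 .
\]
```

Under this assumed form, a slope close to zero would suggest that further trials change the score very little, which is the sense in which the slope could indicate how closely a unit has approached the asymptote.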


Objective,  Procedure,  and  Scope 

Since  operational  testing  usually  involves  the  comparison  of 
baseline  systems  to  newly  developed  systems,  participants  are  initially 
determined  to  be  qualified  or  trained  on  the  baseline  system.  Prior  to 
the  actual  conduct  of  the  test,  refresher  training  and/or  contractor 
training is provided on the new system. Through the use of
randomization and test design the effect of learning during the test is
generally expected to be lessened.

The  objective  of  this  study  is  to  determine  the  existence  of  a 
representative learning curve (or set of curves) and develop a
mathematical description of this curve applicable to training levels in
operational testing.

This research involves an "after the fact" analysis of data from
various  test  reports.  Empirical  data  was  collected,  primarily  from 
OTEA  test  reports  and  data  made  available  through  other  training  and 
analysis  agencies.  A  more  detailed  description  of  the  various  data 
collected  is  provided  in  Chapter  IV.  The  data  obtained  was  plotted 
using  consecutive  trials  versus  a  specified  performance  measure/measure 
of effectiveness (MOE) in order to determine if there were patterns
which might suggest a demonstrable group "learning curve".

Linear  regression  models  are  used  to  screen  sample  data  for 
suitability  and  further  analysis,  while  nonlinear  regression  models  are 
used  to  fit  learning  models  to  the  sample  data.  Additionally,  the  fitted 
learning  models  will  be  tested  for  adequacy  through  a  direct  examination 
of  residuals. 
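
The screening, fitting, and residual steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the power function $Y = at^{-b}$ is fitted by ordinary least squares on the logarithms (a common shortcut standing in for iterative nonlinear regression), lower scores are assumed to mean better performance, the function names and the 0.3 screening threshold are hypothetical, and the data are synthetic.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; also returns R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return intercept, slope, r_squared

def passes_screen(trials, scores, min_r_squared=0.3):
    """Screening step: keep a data set only if scores show a downward linear trend.

    Assumes lower scores mean better performance (time or error scores).
    The 0.3 threshold is a hypothetical choice, not a value from the study.
    """
    _, slope, r_squared = fit_line(trials, scores)
    return slope < 0 and r_squared >= min_r_squared

def fit_power_curve(trials, scores):
    """Estimate a and b in Y = a * t**(-b) by least squares on log Y versus log t."""
    log_t = [math.log(t) for t in trials]
    log_y = [math.log(y) for y in scores]
    intercept, slope, _ = fit_line(log_t, log_y)
    return math.exp(intercept), -slope

def residuals(trials, scores, a, b):
    """Observed minus fitted values, for direct examination of residuals."""
    return [y - a * t ** (-b) for t, y in zip(trials, scores)]

# Synthetic, noise-free example data (not taken from any test report).
trials = list(range(1, 11))
scores = [50.0 * t ** (-0.4) for t in trials]

if passes_screen(trials, scores):
    a_hat, b_hat = fit_power_curve(trials, scores)
    res = residuals(trials, scores, a_hat, b_hat)
    worst = max(abs(r) for r in res)
    print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, max |residual| = {worst:.2e}")
```

With real data the residuals would not vanish; plotting them against trial number is one simple way to look for systematic lack of fit.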

The  scope  of  this  research  is  concentrated  on  the  analysis  of 
data  obtained  from  a  military  operational  testing  environment  in  which 
OTEA  operates.  A  survey  of  the  general  literature  is  conducted  to 
determine  the  existence  of  appropriate  industrial  studies  of  group  or 
team  learning  which  might  support  this  study. 

The  initial  background  search  involves  the  theory  of  learning 
along  with  the  use  and  development  of  learning  curves.  This  particular 
aspect  is  expanded  to  include  group  or  team  performance  (learning 
models  discussed  in  Chapter  II). 

The  remainder  of  the  study  involves  development  of  the  methodology 
employed,  a  description  of  data  collected,  and  a  discussion  of  results 
including appropriate recommendations and conclusions.

CHAPTER II

REVIEW  OF  APPLICABLE  LEARNING  THEORY  RESULTS 

This  chapter  contains  a  review  of  general  learning  theory  and  the 
development  of  learning  progress  or  performance  improvement.  It  further 
summarizes  the  application  of  learning  theory  concepts  to  group/team 
learning. 


Learning  Theory 

Learning  is  a  fundamental  process  of  life.  Every  individual 
learns  and  through  learning  develops  modes  of  behavior  by  which  he  lives. 
Learning  may  occur  intentionally,  through  organized  or  unorganized 
activity,  and  the  variables  which  influence  learning  may  be  grouped  under 
the three headings: (1) individual variables, such as capacity and
motivation;  (2)  task  variables,  such  as  meaningfulness  and  difficulty; 
and  (3)  environmental  variables,  such  as  practice  and  knowledge  of 
results  (6). 

The learning phenomenon has been studied by philosophers and
psychologists for centuries; in fact, Aristotle was the first to set
forth laws in an attempt to explain the basis of learning (7).

In  Mednick's  book  (7,8),  learning  has  been  defined  in  terms  of 
four  characteristics.  These  are: 

1.  Learning  results  in  a  behavioral  change.  This  characteristic 
is  the  basic  goal  of  any  efforts  at  learning. 

2.  Learning  is  a  result  of  practice.  This  eliminates 
behavioral changes due to illness, maturation, or motivation.
Although performance may be greatly altered by these
variables, learning is not.

3. Learning is a relatively permanent change. A task which
was learned sometime in the past can be easily resumed
after a little practice.

4.  Learning  is  not  directly  observable.  Performance  is 
affected  by  variables  other  than  learning.  Therefore, 
a  record  of  successive  performance  is  just  that,  and 
cannot be considered an exact representation of the
learning  process. 

Mathematical  Models 

In  order  to  measure  learning  or  compute  the  rate  of  learning, 
mathematical  models  were  developed.  Experiments  in  learning  phenomena 
are  generally  concerned  with  changes  in  some  evidence  of  learning  as  a 
result  of  experiences  on  discrete  trials.  In  most  paired-associate 
learning  paradigms  (models)  the  subject's  knowledge  is  tested  after 
every  exposure  to  the  correct  pairing  (9).  When  a  number  (whether  it 
be  a  probability  value  between  0  and  1,  or  some  integer  value)  changes 
as a result of discrete opportunities, we are more likely to find more
accurate mathematical analogies in difference equations than in
differential equations. But difference equations were not known to
psychologists until the late 1940's and early 1950's.
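
A small worked comparison may help here. The linear form below is a generic textbook example chosen for illustration; the symbols $p_n$, $p(t)$, $\lambda$, and $k$ are assumptions and do not come from the sources cited above. If $p_n$ is the probability of a correct response on trial $n$ and each trial removes a fixed fraction $\lambda$ of the remaining error, the difference equation (first line) and its continuous analogue (second line), each with its solution, are:

```latex
\begin{align*}
p_{n+1} &= p_n + \lambda(1 - p_n), & p_n &= 1 - (1 - p_0)(1 - \lambda)^{n}, \\
\frac{dp}{dt} &= k(1 - p), & p(t) &= 1 - (1 - p_0)e^{-kt}.
\end{align*}
```

The difference equation changes only at whole trials, which matches data recorded trial by trial; the two solutions agree at integer values of $t$ when $e^{-k} = 1 - \lambda$.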

Clark  L.  Hull  (10)  is  sometimes  considered  the  first  mathematical 
learning  theorist,  although  there  are  other,  earlier,  quantitatively 
oriented  theorists  (9).  The  genesis  of  Hull's  model  was  different  from 
that  of  current  models,  and  the  difference  is  a  critical  one.  The 
major  mathematical  technique  used  by  Hull  and  his  contemporaries  was 
curve  fitting.  For  Hull  this  meant  a  somewhat  arbitrary  selection  of 
one from the many equations whose form would be compatible with
previously obtained data. Theory dictated the selection of variables
for his equations, but the precise forms of the equations were derived
primarily out of attempts to fit past data. With the new quantitative
techniques that have become available, it is now possible to permit
the theory to imply the equation form directly, prior to data collection.

The  capacity  to  derive  equations  from  theory,  and  to  see  how  these 
theoretically  derived  equations  conform  to  data  patterns,  is  what  is 
meant  by  a  true  analogy  between  theory  building  in  psychology  and  theory 
building  in  the  physical  sciences. 

A  further  change  from  the  past  in  learning  theory  that  appears 
to  be  fairly  general  in  more  recent  theory  building  is  the  abandonment 
of  the  belief  in  a  general  learning  function  that  should  cover  all 
learning  situations.  More  recent  thinking  recognizes  that  different 
theories, and therefore different mathematical functions, might be
required  for  different  learning  situations.  The  earlier  work  assumed 
that  a  finding  in  one  laboratory,  stemming  from  one  experimental 
paradigm,  could  contradict  the  theory  of  another  experimenter  using 
a  different  paradigm,  with  all  assumed  to  be  exploring  a  similar  process. 

Learning Curves

When several trials are given in an experiment and measures of
learning or of retention are obtained, these measures may be plotted
in the graphic form known as a learning curve, a graph which affords
a comparison of the performance on each trial with the performance on
other trials (6). It is customary to plot the independent variable on the
horizontal axis, the abscissa, and the dependent variable on the
vertical axis, the ordinate. The dependent variable changes as a result
of  the  experimenter's  manipulations.  Scores  on  the  dependent  variable 
are  dependent  upon  or  are  the  function  of  the  experimental  factor  and 
are usually some form of a learning score - errors made, time consumed,
and  so  on. 

One of the things a learning curve reveals is the rate of
improvement and the changes in this rate. A uniform rate of improvement is
indicated by graphs of the type shown in Figure 2-1.


Figure 2-1. Theoretical learning curves showing zero
acceleration, or a uniform rate of improvement.
In A, improvement is shown by an
increase in scores. B depicts those
learning situations wherein decreasing
scores indicate improvement, such as
fewer errors (6).


Here progress is indicated by a straight line. Such a graph
means that the increment of gain is the same for each successive trial.
When the rate of improvement is constant, we have what is known as
zero acceleration.


Most curves of learning show variations in the rate of improvement.
Curves for motor learning usually show the fastest rate of gain at
the beginning and a slowing up as practice continues. Such a change is
called negative acceleration.

The authors, Garry and Kingsley, state that this should not be
confused with a loss of skill. It refers to those cases wherein
improvement is still being made, but the increment of gain is smaller on
each successive trial. Theoretical curves for negative acceleration are
presented in Figure 2-2.


Figure 2-2. Theoretical curves of
negative acceleration
showing a decrease in
the rate of gain (6).
(Horizontal axes: Trials; vertical axes: Scores.)


In the cases in which the scores grow smaller (time scores or
error scores on successive trials) as performance improves, negative
acceleration is indicated by a downward concave curve. Negatively
accelerated curves are most frequently obtained in situations where

(1) the learning task is relatively simple,

(2) the subjects are of average or above ability (either
practiced or bright),

(3) there is positive transfer from previous learning, or

(4) the tests are given toward the end of a series of trials.

Sometimes there is very slow progress at the start, with an
increase in the increments of improvement as practice is continued.
This increase in the rate of improvement is called positive acceleration;
see Figure 2-3.


Figure 2-3. Two theoretical curves of positive
acceleration. In both, the rate of
improvement is faster in the second
half of the learning period than in
the first part (5)


Curves of positive acceleration are frequently found in motor
learning or where previous learning interferes with the new learning.
It is clear that positive acceleration cannot continue indefinitely,
for sooner or later the learner reaches complete mastery or the curve
levels off as he approaches the limit of his ability to improve (6).

It is likely that if we were able to plot a complete learning
curve from zero to the absolute limit of improvement for any single
performance, we should find the S-shaped curve with relatively slow
progress at first followed by increasing increments of gain and leveling
off with decreasing gains as the limit was approached (6).
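
The distinctions among zero, negative, and positive acceleration can be made concrete by looking at how the gain between successive trials changes. The sketch below is illustrative only: the function name, the tolerance, and the example scores are hypothetical, and it assumes the scores move steadily in the direction of improvement.

```python
def classify_acceleration(scores, tol=1e-9):
    """Classify a learning curve by how the per-trial gain changes.

    Gains are measured as the absolute change between successive trials, so
    the same labels apply whether improvement raises scores or lowers them.
    """
    if len(scores) < 3:
        raise ValueError("need at least three trials to compare gains")
    gains = [abs(later - earlier) for earlier, later in zip(scores, scores[1:])]
    changes = [g2 - g1 for g1, g2 in zip(gains, gains[1:])]
    if all(abs(c) <= tol for c in changes):
        return "zero"       # equal gain on every trial (straight line)
    if all(c <= tol for c in changes):
        return "negative"   # gains shrink as practice continues
    if all(c >= -tol for c in changes):
        return "positive"   # gains grow as practice continues
    return "mixed"          # e.g. an S-shaped curve

# Hypothetical score sequences, one per curve type.
print(classify_acceleration([10, 20, 30, 40]))   # uniform gains
print(classify_acceleration([40, 20, 10, 5]))    # error scores, shrinking gains
print(classify_acceleration([10, 11, 14, 20]))   # growing gains
print(classify_acceleration([0, 1, 4, 8, 9]))    # slow, fast, then slow again
```

An S-shaped record is reported as "mixed" because it is positively accelerated early and negatively accelerated late.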

It may be presumed that a very rapid initial rise in a learning
curve is due to the fact that the learning task is not altogether new
to the learner and that he therefore does not begin at a zero point.

The slowing down of the rate of improvement may be caused by
several factors such as reaching the limit of improvement, fatigue,
loss of interest, a sense of sufficiency, lack of desire for further
advancement, and the needless repetition or overlearning of parts of
the performance mastered in the early steps of learning.

The absolute limit of performance is rarely reached. In most
instances, practical limits and motivational limits are the determinant
factors.

Burns (7) reports that the first publication leading to the
industrial application of the learning curve has been credited to T.P.
Wright. Wright (11) showed that as the number of aircraft produced
increases, the cumulative average per unit cost to produce an aircraft
decreases at a constant rate. The model employed was $Y = KX^{c}$, where

$Y$ = the number of direct labor man hours required
to produce the Xth unit

$K$ = the number of direct labor man hours required
to produce the first unit

$X$ = the unit number

$c = \log B / \log 2$, where $B$ equals the learning curve factor,
a constant (.90, .85, .77, etc.)

The mathematical function is called an inverse variation and means that
the dependent variable (Y) gets smaller as the independent variable (X)
gets larger. This relationship is also referred to as an exponential
(log-linear) equation. For a given learning curve, K and c are constants
where K can assume any positive value and c is a constant between
zero and minus one (12,13).
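
A short numerical sketch of this model follows. It assumes the exponent is obtained from the learning curve factor as c = log B / log 2; the function names and the first-unit value of 1000 hours are hypothetical and chosen only for illustration.

```python
import math

def wright_exponent(learning_factor):
    """Exponent c = log(B) / log(2) for a learning curve factor B (e.g. 0.90)."""
    return math.log(learning_factor) / math.log(2.0)

def hours_for_unit(first_unit_hours, unit_number, learning_factor):
    """Y = K * X**c, with K the first-unit hours and X the unit number."""
    c = wright_exponent(learning_factor)
    return first_unit_hours * unit_number ** c

K = 1000.0  # hypothetical direct labor hours for the first unit
for factor in (0.90, 0.85, 0.77):
    c = wright_exponent(factor)
    row = [round(hours_for_unit(K, x, factor), 1) for x in (1, 10, 100)]
    print(f"B = {factor:.2f}, c = {c:.4f}, hours at units 1, 10, 100: {row}")
```

For each of the factors listed, the computed exponent falls between zero and minus one, in line with the range stated above.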

This  has  since  become  known  as  the  cumulative  average  theory  of 
the  learning  curve  (14).  Since  this  first  publication,  learning  curve 
theory has been extended into many areas ranging from the setting of
contract prices to production planning and control (15). In situations
where the learning curve principles can be applied, the government is
also using it in evaluating contract proposals.

In a related article (16), J.D. Patton states that the
manufacturing progress curve is often referred to as a learning curve. He
asserts that improvements usually come from tool design, methods,
materials, and procedures, as well as the employee's learning. This
concept is also useful in the areas of training, maintenance, and other
logistics concerns. He further states that the manufacturing progress
function is assumed to describe a constant percentage improvement as the
production quantities double and that all progress functions will have the
same shape, even though they may differ in the percentage improvements
between doubled production quantities and the direct labor hours
required to complete the first unit. This progress learning curve
utilizes the power function, $Y = KX^{c}$, developed by Wright (11).
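
The doubling property can be checked directly from the power function. The short derivation below assumes that the exponent is tied to the learning curve factor $B$ by $c = \log B / \log 2$ (the usual convention, treated here as an assumption):

```latex
\[
\frac{Y(2X)}{Y(X)} = \frac{K(2X)^{c}}{KX^{c}} = 2^{c} = 2^{\log B / \log 2} = B .
\]
```

The ratio does not depend on $X$ or $K$, so each doubling of quantity multiplies the required hours by the same factor $B$; curves with different $K$ and $B$ share one functional shape and differ only in starting level and rate.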

An  alternative  model,  Y  -  ota  3  was  presented  by  Pegels  (17). 

He  states  that: 

The startup or learning curve literature has in the past
concentrated mainly on the algebraic power function or on
versions based on this function. This concentration is not
unusual because the power function has proven, in numerous
studies, to fit empirical data quite well. However, other
easy-to-apply algebraic functions should also be analyzed
and considered. One such function, an exponential function,
is shown to provide a better fit to several sets of empirical
data than the traditional power function.

The  other  alternative  models  to  which  Pegels  refers  were  usually 
intended  for  specific  applications  or  contained  restrictive  assumptions. 
He specifically mentioned: (1) An S-type function proposed by Carr (18)
which was based on the assumption of a gradual startup. An S-type
function has the shape of the cumulative normal distribution function
for the startup curve and the shape of an operating characteristics
function for the learning curve. (2) Guibert (19) proposed a complicated
multiparameter function with several restrictive assumptions. (3) De Jong
(20) proposed a version of the power function which generates two
components, a fixed component which is set equal to the irreducible
portion of the task, and a variable component, which is subject to
learning.


$Y = a[\theta + (1 - \theta)X^{-b}]$


De Jong calls this fixed component the "factor of incompressibility".
He explains that this factor is dependent not only on the nature of the
work but also upon the commencing combination of skill and familiarity
with the work in hand. The times for manual operations per cycle will
fall gradually, but not to zero as proposed by the standard power
function (Wright) at infinity. They will tend to approach a certain
limiting value. (4) Levy (21) presented a learning function which reaches
a plateau and does not continue to decrease or increase as does the power
function.
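
The contrast between the standard power function and De Jong's version can be sketched numerically. The symbol theta for the factor of incompressibility, the function names, and all parameter values below are assumptions made for illustration; none are taken from the cited papers.

```python
def power_time(x, a, b):
    """Standard power function: Y = a * x**(-b); tends to zero as x grows."""
    return a * x ** (-b)

def de_jong_time(x, a, b, theta):
    """De Jong's form: Y = a * [theta + (1 - theta) * x**(-b)].

    theta is the fixed (incompressible) fraction, so Y tends to a * theta.
    """
    return a * (theta + (1.0 - theta) * x ** (-b))

a, b, theta = 60.0, 0.3, 0.25  # hypothetical cycle time, exponent, fixed fraction
for x in (1, 10, 100, 10_000):
    print(x, round(power_time(x, a, b), 2), round(de_jong_time(x, a, b, theta), 2))

print("limiting value under De Jong's form:", a * theta)
```

Both forms start from the same first-cycle time a; setting theta to zero recovers the standard power function, while any positive theta leaves a floor of a * theta that practice cannot remove.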

An overriding point expressed was that there are no specific
learning curves which have universal application.

Thus  far,  the  discussion  of  learning  and  learning  curves  has  been 
focused on the general theory, aspects of individual learning curves,
and some industrial applications of learning curve theory. This
background will now be used to expand into the area of group/team training
and performance.


Group/Team  Training  and  Performance 
Several  studies  and  laboratory  experiments  have  been  conducted 
in  the  area  of  group/team  training  and  performance.  Some  of  these  take 
the  form  of  a  literature  survey  on  publications  relevant  to  team  training 
and  evaluation,  while  others  report  on  actual  laboratory  cases  or 
experiments  concerning  team  function,  structure  and  performance. 

A  distinction  was  drawn  between  the  terms  team  and  small  group. 
Glaser,  Klaus  and  Egerman  (22,23)  state  that  although  both  refer  to 
collections of individuals acting in concert, a team is usually well
organized, highly structured, and has relatively formal operating
procedures...as exemplified by a baseball team, an aircraft crew, or a
ship control team. Teams generally display the following characteristics:

1. relatively rigid in structure, organization, and
communication  networks, 

2.  have  well  defined  positions  or  member  assignments  so 
that  the  participation  in  a  given  task  by  each  individual 
can  be  anticipated  to  a  given  extent, 

3.  depend  on  the  cooperative  or  coordinated  participation  of 
several  specialized  individuals  whose  activities  contain 
little overlap and who must each perform their task at
least at some minimum level of proficiency,

4. are often involved with equipment or tasks requiring
perceptual-motor activities,

5. can be given specific guidance on job performance based
on a task-analysis of the team's equipment, mission,
or situation (23).

A  small  group,  on  the  other  hand,  rarely  is  so  formal  or  has  well- 
defined,  specialized  tasks  —  as  exemplified  by  a  jury,  a  board  of 
trustees,  or  a  personnel  evaluation  board  (23).  As  contrasted  with  a 
team,  small  groups  generally  have  the  following  characteristics: 

1. have an indefinite structure, organization, and
communication  network, 

2.  have  assumed  rather  than  designated  positions  or 
assignments  so  that  each  individual's  contribution 
to the accomplishment of the task is largely
dependent on his own personal characteristics,

3.  depend  mainly  on  the  quality  of  independent,  individual 
contributions  and  can  frequently  function  well  even 
when  one  or  several  members  are  not  contributing  at  all, 

4. are often involved with complex decision-making
activities,

5. cannot be given much specific guidance beforehand since
the quality and quantity of participation by individual
members is not known.

In  a  review  of  team  training  and  evaluation  by  the  Human  Resources 
Research Organization (HumRRO) (24), the authors state that the review
was undertaken in order to provide an information base that the Defense
Advanced Research Projects Agency could use as a foundation to facilitate
decisions regarding future research program support.

The  technical  report  (24)  reported  the  following  findings  and 
implications: 

As  an  aid  toward  organizing  and  analyzing  the  team  training 
information  obtained,  a  classification  scheme  was  used  to  categorize  the 
training  techniques  and  situations  discussed  in  this  review  along  two 
dimensions.  On  one  dimension,  training  focus,  a  distinction  was  made 
between "team" training and "multi-individual" training. Multi-individual
training  occurs  in  a  group  context  but  focuses  on  the  development  of 
individual  skills.  Team  training,  on  the  other  hand,  is  focused  on 
developing  team  skills  such  as  coordination  and  cooperation.  The  type 
of  task  situation  was  the  second  dimension  used  to  classify  the  training 
techniques reviewed. Task situations were categorized as either
"established" or "emergent." Established situations are those in which the
tasks  and  the  activities  required  to  perform  these  tasks  can  be  almost 
completely  specified.  Emergent  situations  are  those  in  which  all  tasks 
and  activities  cannot  be  specified  and  the  probable  consequences  of  certain 
actions cannot be predicted. This type of situation allows for
unanticipated behaviors to emerge.

Team  training  studies  and  practices  were  categorized  according 
to  the  classification  scheme  described.  These  studies  followed  two 
conceptual models of team behavior: stimulus-response (S-R) and organismic. The
S-R  model  adherents  tended  to  study  team  training  in  laboratory  settings 
derived  from  established  task  situations.  More  realistic  environments 
were used by other researchers who attended to emergent factors in the
job situation (the organismic approach). It was this latter group of
investigators who demonstrated the need for training in team skills,
even though individual skill proficiency was found to be a prerequisite
for effective team training and performance. Other conclusions which
were drawn from the literature are:

1.  The  team  context  is  not  the  proper  location  for  initial 
individual  skill  acquisition. 

2.  Performance  feedback  is  critical  to  the  learning  of  team 
skills,  as  well  as  individual  skills. 

Several examples of team training techniques currently in use in
the military services are also presented in the report; for example,
ARMY TRAINING AND EVALUATION PROGRAM (ARTEP), REALTRAIN, Naval Training
Device Center (NAVTRADEVCEN) program, etc.

In the Final Summary Report by Klaus, Glaser, and others (23),
the seven studies undertaken are briefly described along with their
purpose and major results.

Report  1  described  the  approach  being  examined  in  the  Team 
Training  Laboratory,  one  which  considered  the  team  and  its  output  or 
product  rather  than  the  performance  of  its  individual  members  as  the 
focus  of  investigation  (25). 

Report  2  reported  on  the  acquisition  and  extinction  of  a  team 
response,  a  demonstration  that  basic  principles  of  individual  learning 
could  be  applied  to  the  team  considered  as  a  single  entity  (26). 

Report 3 presented an experiment on the inclusion of parallel or
"redundant" members in a team which confirmed an hypothesis derived from
the underlying approach that redundancy could result in eventual
decrements in team performance (27).

Report  4  further  analyzed  the  effects  of  internal  team  structure 
on the development and maintenance of a team response based upon the degree
of  correspondence  between  individual  performance  and  feedback  supplied  to 
the  team  (28). 

Report 5 identified the relationships among team member
characteristics, the conditions of team training and the speed and thoroughness
with  which  teams  developed  proficiency  that  could  be  demonstrated 
empirically  (29). 

Report  6  explained  the  value  of  more  gradually  introducing  the  low 
ratios  of  reinforcement  typical  of  early  team  performance  providing 
supplemental, supervisory-furnished feedback to team members (30).

Report 7 presented three studies on the simulation of team
environment which considered the degree to which the approach
facilitated the replication of team learning phenomena based on the
performance of a single individual (31).

The  studies  enabled  the  researchers  to  derive  a  learning  theory 
model  of  team  performance  from  among  those  psychological  models  of 
individual  behavior  which  have  proved  most  useful  in  understanding  the 
conditions  likely  to  affect  training  practice. 

The underlying model has three essential features (24). First,
a team is a functioning entity having an output which depends on a
defined input from its members. Second, a team itself can be considered
as the module of investigation and its responses as amenable to
manipulation without necessary reference to the performance of individual
team members. Third, team performance can and will vary as a function of
the consequences of responses much the same as the performance of an
individual learner.

In  Technical  Report  1  (25),  the  first  team  acquisition  curve 
obtained  in  the  Team  Training  Laboratory  is  shown  in  the  bottom  half  of 
Figure  2-4. 

The  curve  is  a  plot  of  the  number  of  correct  team  responses 
per  experimental  period.  It  appears  from  the  correspondence  between 
the  two  curves  that  the  team  response  shows  acquisition  characteristics 
similar  to  an  individual  response.  The  authors  state  that  the  apparent 
improvement in team performance, leading to an asymptote, can tentatively
be  explained  on  the  basis  of  a  temporary  reduction  in  individual 
proficiency  upon  entering  a  team  reinforcement  situation.  Thus,  the 
fact  that  the  team  changes  in  proficiency  as  a  result  of  training  does 
not  require  assumptions  as  to  characteristics  of  a  team  which  are  over 
and  above  the  learning  characteristics  possessed  by  its  individual  members. 

This study is concerned with group or team models for which the data
were obtained from operational tests. The tasks involved are those which
depict learning situations wherein decreasing scores indicate
improvement, such as fewer errors or decreasing performance times on
successive trials. Therefore, the learning curves are expected to follow
some form of the negative acceleration theoretical curve model.

Since  the  team/crews  are  organized  into  two  or  more  members 
(tank  crew,  mortar  crew)  their  organization  is  characteristic  of  those 
described  by  Glaser,  Klaus,  and  Egerman  (23).  In  that  context  the 
basic  principles  of  individual  learning  curve  robustness  will  be 
assumed and analysis of the empirical data will proceed along that line.

Figure 2-4. Comparison of Individual and Team Learning Curves (25)

Various models described previously, such as the power function
with variations and exponential models, will be used to fit the empirical
data and then analyzed for model adequacy. The methodology used to
fit the empirical data and analyze the results will be discussed in
Chapter III.

It was made clear through contacts with other sources of data
that considerable interest is presently being generated in the area of
group/team learning. Several proposed tests are being considered to
analyze group learning. As discussed earlier, the analogy between
individual learning and group learning suggests the substitution of
the organization for the individual when using the classical learning
model.

The Training and Doctrine Command (TRADOC) has conducted an
extensive study into training cost procedures and the utilization of
learning curve theory in the assessment of training proficiency. These
studies include the assessment of both individual and group learning
models along with validated performance measures. The Army Research
Institute (ARI) has also planned tests which will attempt to make
an assessment of group training.


CHAPTER III

METHODOLOGY

One of the principal objectives of this research is to determine
the existence of a representative learning curve (or set of curves) and
to develop a mathematical description of this curve applicable to
training levels in operational testing. The existence of a representative
learning curve could be used to develop improved operational test
and evaluation methodology for training effectiveness. To determine
whether there is a demonstrable learning curve for team/crew performance,
it was necessary to collect and analyze data from operational test
reports. Each data set will be analyzed iteratively utilizing the
following procedures.

1. Determine graphically if learning patterns exist. Sample
data will be plotted to determine if there are patterns in the empirical
data which might suggest that learning can be detected. The performance
measure is plotted against consecutive trials.

2.  Fit  Linear  Model. 

Simple linear regression is used to fit the linear model to the
empirical data, and the null hypothesis that the slope of the regression
line is equal to zero is tested. In data sets where a time component or
a measurement of error is used as the performance measure, the slope of
the regression line is expected to be negative, and the confidence
interval constructed around the slope should not include zero. This
condition indicates that there is learning in the data. If no learning
is detected, the data are not subjected to further analysis.

3. Fit Nonlinear Model.

Upon determining the suitability of the data, that is, graphically
detecting discernible patterns and rejecting the null hypothesis that
the slope of the regression line is zero, nonlinear models are used
to fit the data. These include learning models suggested in the
literature and/or variations based on the graphical patterns of the raw
data (see Table 3-1). The selection of models is restricted to functional
relationships between two variables whereby the performance measure
(Y) can be separated from the trials (t) in such a way that $Y = f(t)$.
Using this relationship, the performance measure is considered to be
the dependent variable and the consecutive trial is the independent
variable. Parameter estimates and a residual sum of squares are
obtained by fitting the nonlinear model.

4.  Test  for  Model  Adequacy. 

The assumption is made that the learning model fit in Step 3
is adequate. A test for "goodness of fit" of the model is used to
verify that assumption, utilizing the analysis of variance conducted
for the significance of regression. A lack of fit test is performed
when repeat observations in the data are available. This is done by
constructing a lack of fit ratio, which will be discussed later.
Additionally, the statistical inferences on the model are checked through
a direct examination of residuals. Model adjustments are made based
on this examination of residuals and a careful examination of outliers,
if any. When adjustments are made, the iterative procedure returns
to step 3 and the model is refit and tested for adequacy.


Table 3-1. Models

Model                                               Origin

$\hat{Y} = a t^{-b}$                                T.P. Wright (11)
$\hat{Y} = a[\beta + (1-\beta) t^{-b}]$             De Jong (20)
$\hat{Y} = a[\alpha^{t-1}] + \beta$                 Pegels (17)
$\hat{Y} = a e^{bt}$                                *
$\hat{Y} = a t^{-b} + c$                            *

*Models suggested by graphical patterns in the data (32)


At this point another learning model or adjusted model is
fit to the sample data and checked for model adequacy.

After fitting all selected models for a particular data sample,
a comparison of models is conducted in step 5 and a new data set is
introduced at step 1.

5.  Selection  of  "Best"  Model. 

The criterion for evaluating the fitted learning models and
selecting the model that provides the "best" fit to the empirical data
will be based on the comparison of (1) the lack of fit ratio, and
(2) the sum of squares for regression (SSR, the amount of variation in
the data explained by regression). This criterion is used because
it is a systematic and quantitative basis for selecting the "best"
model.

The  general  procedures  used  in  fitting  the  selected  mathematical 
models  to  the  empirical  data  and  analyzing  the  models  for  adequacy 
involve  regression  techniques.  These  techniques  provide: 

(1)  Parameter  estimates  for  a  given  model. 

(2) A measure of the error involved in estimating the parameters
and the error variance around the fitted model. The sum of squares
due to error is the amount of noise left in the data after the
regression line has been fit. Where applicable, repeat observations
are used to partition the error component into two parts: the sum of
squares due to pure error (random component) and the sum of squares due
to lack of fit (bias component). Normally, the data collected during
operational tests do not contain repeat observations over trials;
therefore, an estimate of the sum of squares due to pure error is
computed using different crew observations over a specific trial. This
actually represents a measure of the random error between subjects
(crews).

The  regression  procedures  used  are  discussed  in  the  following 
sections. 


Fitting  Linear  Models 

As stated previously, linear regression will be used to fit
the linear model

$Y_i = \beta_0 + \beta_1 t_i + \epsilon_i, \quad i = 1, 2, 3, \ldots, n$    (3-1)

where $t_i$ is the ith consecutive trial of the empirical data from
various test reports. For a given trial t, a corresponding observation
Y consists of the value $\beta_0 + \beta_1 t$ plus an amount $\epsilon$,
the increment by which any individual Y may fall off the regression line.
$\beta_0$ and $\beta_1$ are the linear parameters in the model and are
unknown, as is $\epsilon$, the error or noise component, which changes
for each observation Y. The objectives of this model are

(1) Estimate $\beta_0$, $\beta_1$

(2) Screen data for suitability

The least-squares method is used to estimate the parameters $\beta_0$
and $\beta_1$. This method minimizes the sum of squares of deviations
from the true line and is written (33)

$S = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 t_i)^2$    (3-2)

Estimates are chosen for $\beta_0$ and $\beta_1$ which produce the least
possible value of S.

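The usual basic assumptions for this model were made:

(1) $\epsilon_i$ is a random variable with mean zero and variance
$\sigma^2$ (unknown), that is, $E(\epsilon_i) = 0$, $V(\epsilon_i) = \sigma^2$.

(2) $\epsilon_i$ and $\epsilon_j$ are uncorrelated, $i \neq j$, so that
$\mathrm{cov}(\epsilon_i, \epsilon_j) = 0$.

Thus, $E(Y_i) = \beta_0 + \beta_1 t_i$, $V(Y_i) = \sigma^2$, and $Y_i$
and $Y_j$, $i \neq j$, are uncorrelated.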

Recall that the linear model is fit to develop some idea of
the relationship of the performance measure over consecutive trials.
When estimates of the parameters $\beta_0$ and $\beta_1$ are obtained,
a screening process is conducted to look at the slope ($\hat{\beta}_1$)
of the fitted model. This screening process is used to determine if
there is an indication of learning over consecutive trials. We use the
value from the t-distribution table (with the appropriate degrees of
freedom) to obtain a critical value at a given level. We compare this
value with the ratio given by

$t = \dfrac{\hat{\beta}_1 - \beta_{10}}{\sqrt{MSE / S_{xx}}}$

where $\beta_{10}$ is the hypothesized value of the slope, MSE is an
estimate of the variance, and $S_{xx}$ is the corrected sum of squares
of the trials. From this we would get some approximate idea of whether
or not the slope is negative.

Since the performance measures in the data collected are time
components and measurements of error over consecutive trials, a negative
slope for the regression line would indicate that learning is taking
place over consecutive trials. The hypothesis test on the slope can be
modified, since $\beta_{10} = 0$, to test for the significance of
regression, and an analysis of variance can be conducted. For a further
discussion of this procedure see Draper and Smith (33).
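The screening step described above can be sketched in a few lines. This is a minimal illustration, not the program used in the study: the function name `slope_screen`, its arguments, and the returned dictionary are hypothetical, and the tabled t value is supplied by the caller rather than computed.

```python
import math


def slope_screen(trials, scores, t_critical):
    """Fit Y = b0 + b1*t by least squares and screen the slope for learning.

    t_critical is the two-sided tabled t value for n - 2 degrees of freedom
    at the chosen level; it is looked up by the caller (an assumption of
    this sketch), not computed here.
    """
    n = len(trials)
    t_bar = sum(trials) / n
    y_bar = sum(scores) / n
    # Corrected sum of squares of the trials and corrected cross-product
    s_xx = sum((t - t_bar) ** 2 for t in trials)
    s_xy = sum((t - t_bar) * (y - y_bar) for t, y in zip(trials, scores))
    b1 = s_xy / s_xx                      # slope estimate
    b0 = y_bar - b1 * t_bar               # intercept estimate
    sse = sum((y - (b0 + b1 * t)) ** 2 for t, y in zip(trials, scores))
    mse = sse / (n - 2)                   # estimate of the error variance
    se_b1 = math.sqrt(mse / s_xx)
    t_ratio = (b1 - 0.0) / se_b1          # hypothesized slope is zero
    ci = (b1 - t_critical * se_b1, b1 + t_critical * se_b1)
    # Learning is indicated when the slope is negative and the whole
    # confidence interval lies below zero.
    learning = b1 < 0 and ci[1] < 0
    return {"b0": b0, "b1": b1, "t_ratio": t_ratio, "ci": ci,
            "learning": learning}
```

A data set for which `learning` is false would be dropped from further analysis under step 2 of the procedure; otherwise the nonlinear models are fit next.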

Fitting Nonlinear Model

When hypothesis testing conducted after fitting the linear model
indicates that learning can be detected in the data, the nonlinear
learning models mentioned earlier are fit to the data. Parameter
estimates are obtained along with the residual sum of squares for use
in the model adequacy test.

The SPSS (Statistical Package for the Social Sciences) Subprogram
NONLINEAR (34) is used to apply nonlinear regression analysis to
estimate parameters that appear in the regression model in a nonlinear
fashion. The forms of the learning models in Table 3-1 are known
explicitly or come from an interpretation of the graphical patterns
in the data. The SPSS NONLINEAR program utilizes the Least Squares
Estimation function to estimate the unknown parameters by minimizing
the error sum of squares. For each case, the performance measure
(dependent variable) is defined:

$Y_i = f_i(t, \theta) + \epsilon_i, \quad i = 1, 2, \ldots, n$    (3-3)

where $f_i(t, \theta)$ stands for the model function chosen, $\epsilon_i$
is the error term, and $\theta$ is a vector of parameter estimates.

The assumptions made are $E(\epsilon) = 0$ and $V(\epsilon) = \sigma^2$.
The error sum of squares function can be written as

$S(\theta) = \sum_{i=1}^{n} [Y_i - f_i(t_i, \theta)]^2$    (3-4)

The program minimizes the sum of squares for the model $f_i(t, \theta)$
by choosing suitable values for the unknown parameters ($\theta$) in the
model. This in turn will describe as closely as possible the behavior of
the dependent variable Y.

Marquardt's  nonlinear  minimization  technique  is  used  to  estimate 
the  unknown  parameters.  It  is  a  compromise  between  the  linearization 
(or  Taylor  series)  method  and  the  steepest  descent  method  and  appears 
to  combine  the  best  features  of  both  while  avoiding  their  most  serious 
limitations.  It  almost  always  converges  and  does  not  slow  down  as  it 
approaches the solution.

The idea of Marquardt's method can be explained briefly as
follows (33,34). We start from a certain point in the parameter space,
$\theta$. The method of steepest descent is applied and a certain vector
direction, $\delta_g$, where g stands for gradient, is obtained for
movement away from the initial point. Because of attenuation in the
$S(\theta)$ contours, this may be the best local direction in which to
attain smaller values of $S(\theta)$ but may not be the best overall
direction. However, the best direction must be within 90° of $\delta_g$
or else $S(\theta)$ will get larger locally. The linearization (or Taylor
series) method truncated after the second term leads to another
correction vector $\delta$ given by the linear model

$\delta_0 = (Z_0' Z_0)^{-1} Z_0' (Y - f_0)$    (3-5)

where $\delta_0$ is the vector of estimated parameter corrections, $Z_0$
is an n x p matrix containing the first partial derivatives and $Z_0'$ is
its transpose, and $(Y - f_0)$ is a vector containing the residuals
(actual observation - predicted value).

However, instead of using the linear model to solve for the
parameter estimates, Marquardt's method uses the following equation:

$\delta_0 = (Z_0' Z_0 + \lambda I)^{-1} Z_0' (Y - f_0)$    (3-6)

where I is the identity matrix and $\lambda$ is a correction factor. For
the first iteration $\lambda$ is set to zero and it remains zero for all
subsequent iterations as long as the sum of squares function is reduced.
If at some iteration, say iteration r, the sum of squares function is
increased, then $\lambda$ is replaced with the following expression:

\  +— - 

bJUJ  z  +  xipur 

and the solution in (3-6) is tried again. (This correction tends to
reduce the Euclidean norm of $\delta_r$ to one-half its previous value.)
The value of $\lambda$ is corrected repeatedly until the sum of squares
function is reduced or until the members in $\delta_r$ are too small to
be meaningful, i.e., the norm of $\delta_r$ has been reduced beyond a
tolerance level (34).
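The iteration built on (3-6) can be illustrated for the two-parameter power model $Y = at^{-b}$. The sketch below is written under stated assumptions and is not the SPSS NONLINEAR code: the function names are hypothetical, and the rule for changing the correction factor (ten times larger after a failed step, ten times smaller after a successful one) is a common textbook choice substituted for the program's own replacement expression.

```python
import math


def power_model(a, b, t):
    return a * t ** (-b)


def sse(a, b, trials, scores):
    """Error sum of squares S(theta) for the power model."""
    try:
        return sum((y - power_model(a, b, t)) ** 2
                   for t, y in zip(trials, scores))
    except OverflowError:
        return float("inf")          # treat an overflowing trial point as a failure


def marquardt_power_fit(trials, scores, a0, b0, tol=1e-8, max_iter=200):
    """Estimate (a, b) in Y = a * t**(-b) by damped linearization steps.

    Trials must be positive. The damping schedule is an assumption of this
    sketch, not the rule used by SPSS NONLINEAR.
    """
    a, b, lam = a0, b0, 0.0          # correction factor starts at zero
    current = sse(a, b, trials, scores)
    for _ in range(max_iter):
        # Z: first partial derivatives of the model with respect to a and b
        z = [(t ** (-b), -a * t ** (-b) * math.log(t)) for t in trials]
        r = [y - power_model(a, b, t) for t, y in zip(trials, scores)]
        # Elements of Z'Z + lambda*I and of Z'(Y - f)
        m00 = sum(za * za for za, _ in z) + lam
        m01 = sum(za * zb for za, zb in z)
        m11 = sum(zb * zb for _, zb in z) + lam
        g0 = sum(za * ri for (za, _), ri in zip(z, r))
        g1 = sum(zb * ri for (_, zb), ri in zip(z, r))
        det = m00 * m11 - m01 * m01
        if det == 0.0:               # singular system: add damping and retry
            lam = max(lam * 10.0, 1e-4)
            continue
        da = (m11 * g0 - m01 * g1) / det     # solve the 2 x 2 system
        db = (m00 * g1 - m01 * g0) / det
        trial = sse(a + da, b + db, trials, scores)
        if trial < current:          # sum of squares reduced: accept the step
            a, b, current = a + da, b + db, trial
            lam = lam / 10.0
        else:                        # sum of squares increased: damp more
            lam = max(lam * 10.0, 1e-4)
        if math.hypot(da, db) < tol or lam > 1e12:
            break                    # correction vector too small to matter
    return a, b, current
```

With the correction factor at zero the step is the pure linearization step of (3-5); as the factor grows the step shortens and turns toward the steepest descent direction, which is the compromise described in the text.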

Since the program requires initial estimates of the unknown
parameters, a computer program was used to provide them using data from
the test reports; it is listed in Appendix B.

After the nonlinear model is fit, a direct examination of
residuals is conducted and a lack of fit ratio is computed for comparison
with other models.

If the original observations of a sample data set do not
conform to the model assumptions made, then a log transform of the model
may possibly correct the problem. When a direct examination of the
residuals for a model indicates that the error component is
multiplicative instead of additive, then the log transform of the model
should be computed and fitted to the sample data. For example, the model
$\hat{Y} = at^{-b}$ has multiplicative error when expressed as
$Y = at^{-b}\epsilon$ and additive error when expressed as
$Y = at^{-b} + \epsilon$. In the former case the log transform can be
specified as $\ln Y = \ln a - b \ln t + \ln \epsilon$, but in the latter
case the log transform cannot be specified. The multiplicative error
is exemplified when variability becomes a function of the magnitude of
the responses, such as cases where large errors are linked with large
responses.
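For the multiplicative-error form of the power model, the log transform reduces the fit to a straight line in $\ln t$. The following sketch, with hypothetical function names, fits that line by least squares, converts the estimates back to a and b, and evaluates the residual sum of squares in the original units so that it can be set beside the value from a direct nonlinear fit. It assumes all trials and scores are positive.

```python
import math


def fit_log_power(trials, scores):
    """Fit ln Y = ln a - b ln t by least squares and return (a, b)."""
    x = [math.log(t) for t in trials]
    y = [math.log(v) for v in scores]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Convert back: intercept is ln a, slope is -b
    return math.exp(intercept), -slope


def sse_original_scale(a, b, trials, scores):
    """Residual sum of squares of Y = a * t**(-b) in the units of Y."""
    return sum((v - a * t ** (-b)) ** 2 for t, v in zip(trials, scores))
```

Estimates obtained this way are also a common source of starting values for an iterative nonlinear fit, although that use is an assumption here rather than a description of the Appendix B program.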

When the log transform model is linear, it is fit using step 2;
otherwise step 3 is used. It is then tested for model adequacy. When
comparisons are made between the log transform models and nonlinear
models in step 5 of the iterative process, the parameter estimates must
be converted in order to compare sums of squares.

Model Adequacy

As stated previously, the learning models chosen to fit to sample
data from the various test reports are assumed to be tentatively correct.
Under certain conditions we can check whether or not the models are
correct. This will be done by testing for model adequacy using a
"goodness of fit" test and through a direct examination of residuals.
The residual at each trial is defined as the amount by which the actual
observed value $Y_i$ differs from the fitted value $\hat{Y}_i$ and can be
written as $e_i = Y_i - \hat{Y}_i$. If the learning model chosen is not
correct, then the residuals contain both random (variance error) and
systematic (bias error) components.

Recall that during operational tests, repeat observations are
not taken for each crew across trials. However, all crews are observed
at each consecutive trial and are assumed to be similar in structure
and training level. Therefore, several crew observations at the same
trial $t_i$ are considered repeat points in the data. These "repeats" are
used to obtain an estimate of $\sigma^2$ and represent a measure of the
random error between crews. As a consequence, we can test for the
"goodness of fit" of our learning model. The hypothesis tested (33,35)
can be stated:

$H_0$: The model adequately fits the data
$H_1$: The model does not fit the data

The test involves partitioning the error or residual sum of squares into
the following two components:

$SS_E = SS_{PE} + SS_{LOF}$    (3-7)

where $SS_{PE}$ is the sum of squares attributable to random error
between crews and $SS_{LOF}$ is the sum of squares attributable to the
lack of fit of the model. The pure error estimate of $\sigma^2$ is found
by computing the contribution to the pure error sum of squares from the
ith consecutive trial when there are at least two observations, such that

$Y_{11}, Y_{12}, \ldots, Y_{1n_1}$ are $n_1$ repeat observations at $t_1$
$Y_{21}, Y_{22}, \ldots, Y_{2n_2}$ are $n_2$ repeat observations at $t_2$
...
$Y_{m1}, Y_{m2}, \ldots, Y_{mn_m}$ are $n_m$ repeat observations at $t_m$

The total sum of squares for pure error is calculated as follows:

$SS_{PE} = \sum_{i=1}^{m} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2$    (3-8)

where m is the number of distinct levels of t,
      $n_i$ is the number of observations at trial i,
      $Y_{ij}$ is a single observation, and
      $\bar{Y}_i$ is the sample mean across a particular trial.

The total degrees of freedom associated with the total sum of squares
pure error is computed as follows:

total degrees of freedom $= \sum_{i=1}^{m} n_i - m = n_e$


The sum of squares for lack of fit is computed by subtraction,

$SS_{LOF} = SS_E - SS_{PE}$

with $n - 2 - n_e$ degrees of freedom, where n is the total number of
observations (35). The mean square for pure error is

$MS_{PE} = \dfrac{SS_{PE}}{n_e} = \dfrac{\sum_{i=1}^{m} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2}{\sum_{i=1}^{m} (n_i - 1)}$

and is an estimate of $\sigma^2$.

The pure error sum of squares is introduced into the analysis
of variance procedure and the F-ratio is computed. This ratio,
$F = MS_{LOF} / MS_{PE}$, is compared with the $100(1-\alpha)\%$ point of
an F-distribution with $(n - 2 - n_e)$ and $n_e$ degrees of freedom if the
normality assumption is satisfied. If the ratio is

(1) Significant, this indicates that the model appears to be
inadequate. Attempts would be made to discover where and how the
inadequacy occurs.

(2) Not significant, this indicates that there appears to be
no reason to doubt the adequacy of the model, and both pure error and
lack of fit mean squares can be pooled and used as estimates of
$\sigma^2$ (33).

The usual tests which are appropriate in the linear model case
are, in general, not appropriate when the model is nonlinear (33). As
a practical procedure we can compare the unexplained variation with an
estimate of $V(Y_i) = \sigma^2$ but cannot use the F-statistic to obtain
conclusions at any stated level. In the absence of exact results for the
nonlinear models, we can regard this sum of squares as being based on the
total degrees of freedom for residuals/error. In the nonlinear case
this does not, in general, lead to an unbiased estimate of $\sigma^2$ as
in the linear case, even when the model is correct.

A pure error estimate of $\sigma^2$ can be obtained from the repeat
observations as discussed earlier. This provides a sum of squares
($SS_{PE}$) with $n_e$ degrees of freedom. An approximate idea of possible
lack of fit can be obtained by evaluating $SS_E - SS_{PE} = SS_{LOF}$ and
comparing mean squares

$MS_{LOF} = \dfrac{SS_{LOF}}{n - n_e}$ and $MS_{PE} = \dfrac{SS_{PE}}{n_e}$


Draper and Smith state that an F-test is not applicable here but that
we can use the value from the F table (with the appropriate degrees of
freedom) as a measure of comparison. From this we would get some
approximate idea of how well the learning model fits. Measures of
nonlinearity suggested by E.M.L. Beale (36,37) can be used to help decide
when linearized results provide acceptable approximations, but they are
not used for this study.
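The lack of fit ratio can be computed directly from crew observations grouped by trial. The sketch below uses hypothetical names and takes the fitted model as a function `predict`. The lack-of-fit degrees of freedom are taken as n - p - n_e for a model with p parameters, which is an assumption of this sketch (it reduces to n - 2 - n_e for the straight-line model). For a nonlinear model the resulting ratio is only a rough measure of comparison, as noted above.

```python
from collections import defaultdict


def lack_of_fit(trials, scores, predict, n_params):
    """Partition the residual sum of squares into pure error and lack of fit.

    Observations from different crews at the same trial are treated as
    repeat points. n_params is the number of fitted parameters p.
    """
    groups = defaultdict(list)
    for t, y in zip(trials, scores):
        groups[t].append(y)

    ss_pe = 0.0
    n_e = 0
    for ys in groups.values():
        if len(ys) >= 2:                 # only trials with repeats contribute
            mean = sum(ys) / len(ys)
            ss_pe += sum((y - mean) ** 2 for y in ys)
            n_e += len(ys) - 1

    ss_e = sum((y - predict(t)) ** 2 for t, y in zip(trials, scores))
    ss_lof = ss_e - ss_pe                # computed by subtraction
    df_lof = len(scores) - n_params - n_e
    if n_e <= 0 or df_lof <= 0:
        raise ValueError("not enough repeat observations for a lack of fit ratio")

    ms_pe = ss_pe / n_e
    ms_lof = ss_lof / df_lof
    return {"ss_pe": ss_pe, "ss_lof": ss_lof, "n_e": n_e, "df_lof": df_lof,
            "ms_pe": ms_pe, "ms_lof": ms_lof, "ratio": ms_lof / ms_pe}
```

The returned `ratio` is the quantity set beside the tabled F value, and it is also one of the two quantities compared across models in step 5.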

Since residuals are measures of the error component, the assumptions
made concerning the selected model and an assessment of model
adequacy can be evaluated through a direct examination of residuals.
Recall that the residuals $e_i$, $i = 1, 2, \ldots, n$, represent the
deviation of the observations after the regression line has been fit and
can be expressed as $e_i = Y_i - \hat{Y}_i$, where $Y_i$ is an observation
and $\hat{Y}_i$ is the corresponding fitted value obtained by use of the
fitted regression equation (33). From this definition, the residuals
$e_i$ are the differences between what is actually observed and what is
predicted by the regression equation, that is, the amount which the
regression equation has not been able to explain, or the observed errors
if the model is correct.

The usual assumptions are that the errors are independent
(uncorrelated), have zero mean, and have a constant variance,
$\sigma^2$. If, in fact, the errors in the sample data follow a normal
distribution, the F-test can be made. Through a direct examination of
the residuals we can conclude either (1) the assumptions appear to be
violated or (2) the assumptions do not appear to be violated. This direct
examination will be done by (1) plotting the residuals overall,
(2) plotting them in time sequence, and (3) constructing histograms of
the residuals. If the learning model is correct, the residuals should
resemble observations from a normal distribution with zero mean. The
patterns of the plotted residuals will also give indications about
homogeneity of variances, abnormality, and possible outliers, that is,
unusual points in the data that are far greater than the rest in absolute
value and perhaps lie three or four standard deviations or further from
the mean of the residuals. These errors may be linked to equipment
failures or errors in recording the observations, and their causes should
be obtained from background information concerning the various test
reports.

To determine if the residuals are independent, an estimate of
their autocorrelation function is obtained and examined. An estimate of
the autocorrelation coefficient at a particular lag is computed using the
following expression:

$r(\ell) = \dfrac{1}{(N - \ell - 1)\,S^2} \sum_{t=1}^{N-\ell} (Y_t - \bar{Y})(Y_{t+\ell} - \bar{Y})$

where N equals the number of residuals, $Y_t$ is the computed residual at
trial t, $\ell$ equals the lag, $\bar{Y}$ is the sample mean, and $S^2$ is
an estimate of the variance.
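A lag autocorrelation of the residuals can be sketched as follows. The function name is hypothetical, and the normalization is the common sample form (lagged cross-products divided by the total sum of squares about the mean), which is an assumption and may differ from the expression above by a constant factor.

```python
def residual_autocorrelation(residuals, lag):
    """Sample autocorrelation of a residual series at the given lag."""
    n = len(residuals)
    if lag < 1 or lag >= n:
        raise ValueError("lag must be between 1 and len(residuals) - 1")
    mean = sum(residuals) / n
    total = sum((e - mean) ** 2 for e in residuals)
    # Cross-products of residuals that are `lag` trials apart
    cross = sum((residuals[t] - mean) * (residuals[t + lag] - mean)
                for t in range(n - lag))
    return cross / total
```

Values near zero at every lag are consistent with independent residuals; a run of large values at short lags suggests that the fitted curve is missing a systematic pattern over trials.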


CHAPTER IV

DATA  ANALYSIS 


The first major task in this research study was that of data
collection. Although OTEA was the primary source of data, other Army
agencies in the training analysis area were also contacted. These
include the Army Research Institute (ARI), the Training Development
Division/System Analysis Branch of the Infantry School, the Infantry
Board (USAIB), and the TRADOC Combined Arms Training Agency (TCATA). OTEA
provided operational test reports or extracts concerning data relating
to performance/learning in past tests, and made available knowledgeable
personnel to provide background information where possible.

Due  to  the  nature  of  the  study,  there  were  limitations  placed 
on  the  characteristics  of  the  data  required.  The  limitations  are  listed 
below: 

1.  Data  had  to  come  from  an  operational  testing  environment. 

2.  Tests  conducted  should  involve  team/crew  tasks  and 
performance  objectives. 

3. Criteria or measures of effectiveness must be applicable
to team/crew tasks within the context of group or team
definitions as discussed in Chapter II.

4.  Test  reports  must  provide  a  means  of  tracking  a  team/crew 
from  start  to  finish.  That  is,  performance  measured  over 
time  or  consecutive  trials. 

5. When applicable, test reports should provide some insight
into the background information concerning the data relevant
to this study, such as measurement error and conditions that
may have affected the test results ("noise" in the data).

It became apparent from the outset that little empirical data
was available in the context mentioned above. Factors affecting the
availability of data were:

1. The cost is prohibitive, or it is infeasible, to conduct more
than one or two trials in some data collection efforts.

2.  Crew  or  group  membership  changes  rapidly  and  significantly 
affects  the  results. 

3. In some cases where test reports were selected, adequate
information was not available to trace a particular crew
from start to finish. Therefore, changes in performance
could not be adequately established or inferred.

Descriptions of the data collected and their analysis will be discussed
in the following sections. Table 4-1 lists each sample data set and its
origin.


Table 4-1. Data Base

Title                                                    Origin

Improved Tow Vehicle (ITV) (38)                          OTEA
Dragon (39)                                              OTEA
REALTRAIN Validation with Combat Units in Europe (40)    ARI
REALTRAIN Validation for Rifle Squads (41)               ARI
Project Stalk (42)
Lightweight Company Mortar System (OTI) (43)             OTEA
Team Training (Experiment VIII) (44)                     NAVTRADEVCEN


Improved Tow Vehicle (ITV)

The ITV operational test was conducted to compare four systems,
each having six dedicated gun crews with alternates. The gunners
tracked targets over four range bands which included two target profiles.
All gunners were trained and ranked on a baseline system prior to
allocation to separate systems. Additionally, contractor training was
conducted for gunners assigned to the new system. A summarized
description is provided below:

1. Performance measure - Root Mean Square Error (RMS)

2.  Characteristics 

(a)  Four  systems 

(b)  24  primary  gunners 

(c)  5  gunners 

(d)  Approximately  12  to  16  trials  per  gun  crew  with  a 
total  of  1760  observations 

(e)  Type  of  activity  -  tracking 

It should be noted that in the context of the definition of
group/team learning tasks, the performance measure (RMS) analyzed does
not reflect a team measure of effectiveness. However, since this was
the initial data sample received and was thought to contain detectable
learning, an analysis was still performed.

In the initial analysis of the ITV data sample it was felt that
there might be some effect on the data due to specific combinations of
range and target profile (evasive maneuvers). Therefore, an analysis was
conducted to determine if some adjustment was required for these effects.
All eight possible combinations of range and target profile were computed
and a linear regression procedure was performed to estimate which
combination should be adjusted. The results of the regression procedure
indicated that while the overall regression appeared to be significant
at the 5 percent level, the confidence intervals around the parameter
estimates included zero, and it was concluded that no specific combination
of range and target profile had a significant effect. Therefore, no
adjustment procedure was employed and the iterative analysis procedure
was initiated.

Twenty-four (24) individual gun crew data plots were made to
determine if a discernible pattern indicated learning over consecutive
trials. The majority of the plots do not indicate such a pattern; only
in a few cases could some slight indication of learning be detected.
Representative plots are shown in Figures A-1 through A-6. In addition,
24 plots of the linear regression line with a 95 percent confidence
interval were made and they depicted similar results.

An  aggregate  data  sample  for  each  system  was  developed  using  the 
average  response  for  the  crews  at  each  trial.  Fitting  the  linear  model 
in  step  2  of  the  iterative  procedure  shows  the  following  results  for 
the  four  systems  analyzed. 


System A

Source        d.f.    Sum of Squares    Mean Square

Regression      1        .00157           .00157
Residual       60        .04771           .0007952

F-ratio = .00157 / .00080 = 1.974
When compared to the F-distribution value for 1 and 60 degrees of
freedom at the 5 percent level, there is no evidence to reject the
hypothesis that $\beta_1 = 0$. The confidence interval around $\beta_1$
includes zero and it appears that learning cannot be detected.
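The F-ratio used in this screening step can be checked directly from the reported sums of squares. The following is a minimal Python sketch, not part of the original analysis: the function name `f_ratio` is illustrative, and the critical value 4.00 for F(.05; 1, 60) is assumed from standard F tables rather than taken from the report.

```python
def f_ratio(ss_regression, df_regression, ss_residual, df_residual):
    """Return the ANOVA F-ratio: regression mean square / residual mean square."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return ms_regression / ms_residual


# System A sums of squares and degrees of freedom as reported for the ITV test.
F_CRITICAL_1_60 = 4.00  # assumed tabled value of F(.05; 1, 60)

f_system_a = f_ratio(0.00157, 1, 0.04771, 60)
learning_detected = f_system_a > F_CRITICAL_1_60

print(round(f_system_a, 3), learning_detected)
```

Because the computed ratio falls below the tabled value, the sketch reaches the same conclusion as the text for System A.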


System B

Source        d.f.    Sum of Squares    Mean Square

Regression      1        .00926           .00926
Residual       55        .03370           .00061

F-ratio = .00926 / .00061 = 15.11089*

*Significant at the 5 percent level


For System B, the confidence interval around $\beta_1$ does not include
zero and $\hat{\beta}_1 = .004388$, which indicates that there is
detectable learning.


System C

Source        d.f.    Sum of Squares    Mean Square

Regression      1        .00029           .00029
Residual       75        .10931           .00146

F-ratio = .00029 / .00146 = .19907*

*Not significant at the 5 percent level
System D

Source        d.f.    Sum of Squares    Mean Square

Regression      1        .00163           .00163
Residual       89        .02448           .00028

F-ratio = .00163 / .00028 = 5.9146*

*Significant at the 5 percent level


Systems B and D appear to have detectable learning while systems
A and C did not. Since system B appears to have the largest F-ratio
and slope estimate, the aggregate data sample was modified to use the
individual crew response at each trial. This was done to provide an
estimate of the lack of fit when the nonlinear models were fit in step
3 of the iterative procedure. The results of fitting the nonlinear
models are shown in Table 4-2. The exponential model $Y = ae^{bt}$, where
a = .040708 and b = -.009424, and the power function $Y = at^{-b}$, where
a = .047369 and b = .13539, appear to provide an adequate fit to the
sample data.

Since the performance measure actually represents an individual
measure of effectiveness, further analysis was not undertaken.

Dragon 

An operational test on the Dragon weapon system was conducted
by OTEA using 32 gun crews. Gun crews tracked and fired on targets at
various range bands. Each crew was observed over 15-20 consecutive
Table 4-2. Comparative Results for Fitted Models
(System B (ITV))

                                                                    Lack of
Model                                SSE         SSLOF       SSR         Fit Ratio

$Y = at^{-b}$                        .29665006   .04937006   .336350     .78967
$Y = ae^{bt}$                        .2953304    .04775      .3375696    .7637638
$Y = ae^{b/t}$                       .3015603    .0542803    .3313397    .8683320
$Y = a[\beta + (1-\beta)t^{-b}]$     .30302805   .05574805   .32987195   .89169
$Y = a(\alpha^{t-1}) + \beta$        .30302805   .05574805   .32987195   .89169
$Y = at^{-b} + c$                    .29606377   .04878377   .336836     .78029
$Y = a/(t+b) + c$                    .29557716   .0482972    .33732284   .77251

SSPE = .24728
trials. A summarized description is provided below.

1. Performance Measure - Time components (seconds)

(a) Identification of target to launch (T2)

(b) Time between target hit and disposal of used round (T4)

2. Characteristics

(a) 32 gun crews

(b) Type of activity - tracking

The two time components, T2 and T4, were both plotted against consecutive
trials. The graphical representations show no discernible learning
patterns in the data. Representative plots are shown in Figures A-7
through A-9. Furthermore, the linear regression shows that the slope
($\beta_1$) of the regression line is essentially zero.


T2 Aggregate

Source        d.f.    Sum of Squares    Mean Square

Regression      1         35.29688         35.29688
Residuals     166     263026.55431       1584.49732

F-ratio = 35.29688 / 1584.49732 = .02228*

*Not significant at the 5 percent level

T4 Aggregate

Source        d.f.    Sum of Squares    Mean Square

Regression      1          3.83857          3.83857
Residuals     166       9285.15548         55.93467

F-ratio = 3.83857 / 55.93467 = .06863*

*Not significant at the 5 percent level
Since the Dragon sample data failed to meet the suitability criteria
during the screening process, no further analysis was performed.

REALTRAIN  Validation  with  Combat  Units  in  Europe 
The  REALTRAIN  exercise  provided  a  two-sided,  free-play  situation 
for  infantry  and  armor  units  in  a  simulated  tactical  environment.  It 
provided  for  a  sequential  record  of  events  during  each  engagement  which 
included  an  assessment  of  casualties.  A  summarized  description  is 
provided  below. 

1.  Performance  measure  -  Casualty  rate 

2.  Characteristics 

(a)  Two  teams  (conventional  training  vs  REALTRAIN  methods) 

(b)  Each  team  consisted  of 

(1)  Tank  Platoon 

(2)  Two  Infantry  Squads 

(3) TOW Section

This sample was deemed inappropriate because it contained consolidated
data over two trials. That is, the exercise was run over two or three
phases and all observations were averaged together and displayed in
graphical form. Raw data for each unit was not available. Since our
learning models contain at least two unknown parameters, further
analysis would be misleading.

REALTRAIN  Validation  for  Rifle  Squads 
This  REALTRAIN  exercise  provided  a  two-sided,  free-play  situation 
for  18  rifle  squads.  Nine  squads  were  trained  using  REALTRAIN  techniques 
and  the  other  nine  squads  were  trained  using  conventional  techniques. 

The rifle squads were pitted against each other (REALTRAIN vs
Conventional) in a simulated tactical environment. An assessment of
the casualty rate (sustained vs inflicted) was recorded during each
engagement. A summarized description is provided below.

1. Performance Measure - Casualty rate (sustained vs inflicted)

2. Characteristics

(a) Two training methods - Conventional vs REALTRAIN

(b) 18 rifle squads

(c) 9 squads/training method

(d) Type of Test - Tactical Exercise

Observations  for  all  squads  were  averaged  and  displayed  graphically. 
Only  two  phases  (trials)  of  the  exercise  were  conducted.  Therefore, 
it  was  also  concluded  that  this  data  sample  was  inappropriate  for 
analysis. 


Project  Stalk 

Twenty-five tank crews operating under conditions of competitive
stress and rigidly uniform training were timed in their performance
at hitting a stationary target which appeared suddenly as a result of
the travel of their tank. Eleven different tank and fire control
conditions were run by each of the twenty-five crews participating
in the test. Crews were given instructions to obtain a target hit in
a minimum time. Crews were timed in their speed at recognizing the
target, loading the round, laying the gun, etc., until a hit was
obtained. Two types of test courses were used. On the first type,
range and characteristics of the target and tank positions were
repeatedly observed by the crews. On the second course none of these
factors was known by the crews. The experimental design was such that
factors related to differences in training, testing conditions, and crew
proficiency could be accounted for when comparing the performance of
the five tanks. A summarized description is shown below.

1.  Performance  Measure  -  Time  of  detection  to  hit  on  target 

2.  Characteristics 

(a)  Twenty-five  crews 

(b)  Five  types  of  tanks  used 

(c)  Each  crew  was  trained  on  a  tank  immediately  prior  to 
firing  it. 

(d)  Type  of  activity  -  Tank  gunnery 

Data for sixteen of the twenty-five crews were used because it
was felt that this provided an adequate number of degrees of freedom
and the addition of the others would only marginally affect the results.
In addition, because of the time required to extract the data from the
test reports, it appeared that the sixteen crews selected adequately
represented the data sample. Background information indicated no
rank-order performance in assigning tank crews to the five platoons.
Therefore, the selection of the 16 crews did not appear to perpetuate
any bias effect in the analysis. Each crew was trained under rigidly
uniform conditions and given the same instructions during the conduct
of the test. Background information also reveals that

The  crew  differences  in  recognition  time  are  similar  to  crew 
differences  observed  for  other  operations  and  exhibit  the 
normal  spread  of  proficiency  attainment  of  human  beings.  It 
has  been  observed  that,  whatever  the  ultimate  cause  of  crew 
differences  in  recognition  time,  they  were  appreciable  and 
reasonably  constant....  The  correlation  coefficient  between 
the  average  recognition  time  of  each  of  the  individual  crews 
on  the  Test  Course  targets  and  the  average  recognition  time 
of  the  corresponding  crews  on  Training  Test  Courses  targets 
is  indicative  of  the  crew  consistency.  (43) 

Data was plotted for the sixteen crews and the patterns of the plots
showed significant learning (see Figures A-10 through A-17).
Background information revealed that the recognition-to-hit time
reflected the reduced times to perform the individual operations with
training, decreasing from an average for the four non-transfer
targets on the Test Course of 66.4 seconds for Phase I to 33.1 seconds
in the final phase (43). Only observations for non-transfer targets
were used because target 4 in the Test Training Course (TTC) and target
5 in the Test Course (TC) required the unloading and reloading of
another round in the gun. For example, in the former case, target 3
required AP (antipersonnel) ammunition, and the gun is immediately
reloaded upon firing a round at any target in anticipation of another
being required. After getting a hit on target 3, the loader had to
unload the AP round and store it, then load the proper HE (high
explosive) round for target 4. This procedure resulted in a first
round load time about 20 seconds longer than was required at other
targets (43).

The  times  to  achieve  a  target  hit  were  found  to  decrease  markedly 
with  crew  training.  Although  the  hitting  probabilities  were  found 
not  to  increase  with  training,  the  time  to  load  the  rounds  and  lay  the 
gun  decreased  greatly  with  the  training  given  the  crews  during  the 
test. 

Two aggregate data sets, one for the Test Training Course (TTC)
and one for the Test Course (TC), were developed by combining the data
for the 16 crews across the four non-transfer targets and the eleven
conditions for each target. This provided a method of tracking the crew
performances throughout the test according to the Greco-Latin test
design used.

The TTC data consisted of 678 observations and the TC data consisted of
674 observations over 44 trials. When the linear model was fit to both
data sets in step 2 of the screening process, the following results were
indicated.


TTC

Source        d.f.    Sum of Squares      Mean Square

Regression      1        87726.475         87726.475
Residuals     676      2827995.42068        4183.425

F-ratio = 87726.475 / 4183.425 = 20.970

TC

Source        d.f.    Sum of Squares      Mean Square

Regression      1        82522.39281       82522.39281
Residuals     672      2440873.25556        3632.259308

F-ratio = 82522.39281 / 3632.259308 = 22.719

When compared to the F-distribution value for the appropriate
degrees of freedom at the 5 percent level, there was evidence to reject
the hypothesis that $\beta_1 = 0$. The confidence intervals around
$\beta_1$ for both data sets did not include zero. Since the estimates
of $\beta_1$ were both negative, there was an indication that learning
was occurring.
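The step 2 screen rests on an ordinary least-squares line through the (trial, response) pairs and the associated analysis of variance. The sketch below shows one way that computation could be carried out; the function name `linear_anova`, the four data points, and the tabled value 18.51 for F(.05; 1, 2) are illustrative assumptions and do not come from the test reports.

```python
def linear_anova(trials, responses):
    """Fit Y = b0 + b1*t by least squares and return slope, SS terms, and F."""
    n = len(trials)
    t_bar = sum(trials) / n
    y_bar = sum(responses) / n
    s_tt = sum((t - t_bar) ** 2 for t in trials)
    s_ty = sum((t - t_bar) * (y - y_bar) for t, y in zip(trials, responses))
    slope = s_ty / s_tt
    intercept = y_bar - slope * t_bar
    ss_total = sum((y - y_bar) ** 2 for y in responses)
    ss_regression = slope * s_ty          # 1 degree of freedom
    ss_residual = ss_total - ss_regression  # n - 2 degrees of freedom
    f = ss_regression / (ss_residual / (n - 2))
    return {
        "slope": slope,
        "intercept": intercept,
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "f_ratio": f,
    }


# Hypothetical times (seconds) over four consecutive trials.
result = linear_anova([1, 2, 3, 4], [10.0, 8.0, 7.0, 3.0])

F_CRITICAL_1_2 = 18.51  # assumed tabled value of F(.05; 1, 2)
passes_screen = result["slope"] < 0 and result["f_ratio"] > F_CRITICAL_1_2
print(result, passes_screen)
```

A data set passes this sketch of the screen only when the slope is negative and the regression is significant, mirroring the two conditions cited in the text.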

Both  data  sets  satisfied  the  suitability  criteria  specified  in 
the  screening  process;  therefore,  the  nonlinear  learning  models  listed 
in  Table  3-1  were  fit  to  the  data. 
Initially three models were fit.

(1) $Y = at^{-b}$

(2) $Y = ae^{bt}$

(3) $Y = ae^{b/t}$

First, the Test Training Course data were analyzed. Parameter estimates
and a residual sum of squares were obtained by using the SPSS Nonlinear
Subprogram.

(1) $Y = at^{-b}$ where a = 86.13708 b = -.173043 $SS_E$ = 2851060.4

(2) $Y = ae^{bt}$ where a = 77.2504 b = -.01792 $SS_E$ = 2822300.5

(3) $Y = ae^{b/t}$ where a = 51.51 b = .31028 $SS_E$ = 2906957.3
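For readers without access to a nonlinear regression routine, the power function can be fit approximately through its log transform, $\ln Y = \ln a - b \ln t$, which is one of the log-transform models compared in Table 4-3. The sketch below is a swapped-in technique, not the SPSS nonlinear procedure used for the estimates above, so its estimates would generally differ from the nonlinear ones. The function name `fit_power_log` and the noise-free sample data are hypothetical.

```python
import math


def fit_power_log(trials, responses):
    """Estimate (a, b) in Y = a * t**(-b) by least squares on ln Y vs ln t."""
    x = [math.log(t) for t in trials]
    z = [math.log(y) for y in responses]
    n = len(x)
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    slope = s_xz / s_xx                    # slope of ln Y on ln t equals -b
    a = math.exp(z_bar - slope * x_bar)    # intercept is ln a
    return a, -slope


# Hypothetical noise-free data generated from Y = 80 * t**(-0.2).
trials = list(range(1, 11))
responses = [80.0 * t ** -0.2 for t in trials]

a_hat, b_hat = fit_power_log(trials, responses)
print(a_hat, b_hat)
```

The log transform weights errors multiplicatively rather than additively, which is why the nonlinear and log-transform fits are reported separately in the comparison tables.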

To obtain an approximate idea of the lack of fit of the models, a pure
error estimate of $\sigma^2$ was computed as discussed in Chapter III by
using the 16 crew observations over each trial.

$$SS_{PE} = \sum_{i=1}^{44} \sum_{u=1}^{n_i} (Y_{iu} - \bar{Y}_i)^2 = 2339080.18552$$

Since $SS_E = SS_{PE} + SS_{LOF}$, the sum of squares for lack of fit was
obtained by subtraction. Using the model $Y = at^{-b}$,

$$SS_{LOF} = SS_E - SS_{PE} = 2851060.4 - 2339080.18552 = 511980.214$$

A lack of fit ratio was obtained by comparing the mean squares.

$$MS_{LOF} = \frac{SS_{LOF}}{n_r - n_e} = \frac{511980.214}{42} = 12190.00512$$

$$MS_{PE} = \frac{SS_{PE}}{n_e} = \frac{2339080.18552}{634} = 3689.4009$$

$$\text{Lack of Fit ratio} = \frac{12190.00512}{3689.4009} = 3.304$$
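The lack of fit arithmetic can be reproduced in a few lines. The sketch below assumes the degrees of freedom shown in the computation (42 for lack of fit, 634 for pure error); the function names `pure_error_ss` and `lack_of_fit_ratio` are illustrative, and the small replicate example is hypothetical.

```python
def pure_error_ss(groups):
    """Sum of squared deviations of replicates about their own trial mean.

    `groups` holds one list of replicate responses per trial.
    """
    total = 0.0
    for observations in groups:
        mean = sum(observations) / len(observations)
        total += sum((y - mean) ** 2 for y in observations)
    return total


def lack_of_fit_ratio(ss_error, ss_pure_error, df_lack_of_fit, df_pure_error):
    """Return (SS_LOF, MS_LOF / MS_PE) with SS_LOF obtained by subtraction."""
    ss_lof = ss_error - ss_pure_error
    ms_lof = ss_lof / df_lack_of_fit
    ms_pe = ss_pure_error / df_pure_error
    return ss_lof, ms_lof / ms_pe


# Values reported for the power function fit to the TTC data.
ss_lof, ratio = lack_of_fit_ratio(2851060.4, 2339080.18552, 42, 634)
print(round(ss_lof, 3), round(ratio, 3))

# Hypothetical replicates: two trials with two and three observations.
print(pure_error_ss([[1.0, 3.0], [2.0, 2.0, 2.0]]))
```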


The lack of fit ratios for (2) and (3) are shown in Table 4-3. To
further test the model for adequacy, a direct examination of residuals
was conducted. Figure A-18 shows an overall plot of the average
residuals across the 44 trials for the 16 crews. By visual inspection it
appeared that the average residuals at trials 1, 4, and 42 were atypical
of the others. The majority of the individual residuals at those trials
appeared to lie ±3 standard deviations from the mean of the residuals.
Even though there were one or two residuals which did not exceed the
criterion, it was concluded that the removal of the entire set of
observations at those trials would not adversely affect the analysis.
The model $Y = at^{-b}$ appears to fit the data and is selected as the
"best" model. Even though De Jong's model and $Y = at^{-b} + c$ appear to
have a somewhat smaller lack of fit ratio with correspondingly larger SS
regression, the power function ($Y = at^{-b}$) is selected due to
parsimony. That is, it has fewer parameters, and the other two models do
not appear to be significantly different from the model $Y = at^{-b}$
where a = 104.595 and b = -.26492.

After fitting and selecting the "best" model we must further
examine its adequacy. We compute the residuals
$e_j = Y_j - \hat{Y}_j$ from the parameter
Table 4-3. Comparative Results for Fitted Models (TTC)

                                                                        Lack of
Model                              SSE          SSLOF        SSR          Fit Ratio

$\hat{Y} = at^{-b}$                2851060.4    511980.214   1991541.85   3.304
$\hat{Y} = ae^{bt}$                2822300.5    483220.314   2020301.75   3.119
$\hat{Y} = ae^{b/t}$               2906957.3    567877.114   1935644.95   3.665
$\ln\hat{Y} = \ln a - b\ln t$      374.1706     73.8112      12.50802     3.710
$\ln\hat{Y} = \ln a + bt$          369.6397     69.2803      17.03892     3.48186
$\ln\hat{Y} = \ln a + b/t$         384.5419     84.18246     2.13672      4.23078

SSPE = 2339080.1855 (Nonlinear models)
SSPE = 300.3594 (Log transform models)
Table 4-4. Comparative Results for Fitted Models (TTC)
(Adjusted Data)

                                                                              Lack of
Model                                    SSE          SSLOF        SSR          Fit Ratio

$\hat{Y} = at^{-b}$                      1527619.0    166437.76    1626287.0    1.856
$\hat{Y} = ae^{bt}$                      1529402.9    168221.66    1624503.1    1.876
$\hat{Y} = ae^{b/t}$                     1545537.8    184356.374   1608368.2    2.06
$\hat{Y} = a/(t+b) + c$                  1534004.0    172822.575   1619902.0    1.927
$\hat{Y} = a[\beta + (1-\beta)t^{-b}]$   1525856.8    164675.375   1628049.2    1.836
$\hat{Y} = at^{-b} + c$                  1526337.5    165156.025   1627568.5    1.842
$\hat{Y} = a(\alpha^{t-1}) + \beta$      1597436.3    236255.06    1556469.7    2.635
$\ln\hat{Y} = \ln a - b\ln t$            310.0763     47.767       19.97871     2.765
$\ln\hat{Y} = \ln a + bt$                308.7863     46.4774      21.269       2.690
$\ln\hat{Y} = \ln a + b/t$               314.2337     51.925       15.82134     3.005
$\ln\hat{Y} = a' + b't$                  .32069       .04804       .76351       2.675

SSPE = 1361181.24 (Nonlinear models)
SSPE = 262.30890 (Log transform models)
SSPE = .27265 (other)

NOTE: Atypical points at trials 1, 6, 42 removed.
estimate and examine their autocorrelation function. The sample
autocorrelation function of the residuals is denoted by
$\{\hat{\rho}_k(e)\}$ (46). Again, the average residual across each trial
is used. Rather than consider the $\hat{\rho}_k(e)$'s individually, we
obtained an indication of whether the first 11 residual autocorrelations
considered together indicate adequacy of the model. As a general rule,
k lag coefficients are examined where $k \le N/4$. This indication is
obtained through an approximate chi-square test for model adequacy.


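$\hat{\rho}_1(e)$ = .02758          $\hat{\rho}_6(e)$ = -.38102
$\hat{\rho}_2(e)$ = -.38909         $\hat{\rho}_7(e)$ = -.03358
$\hat{\rho}_3(e)$ = -.02111         $\hat{\rho}_8(e)$ = .37201
$\hat{\rho}_4(e)$ = .38570          $\hat{\rho}_9(e)$ = -.16558
$\hat{\rho}_5(e)$ = -.34704         $\hat{\rho}_{10}(e)$ = -.22597
                                    $\hat{\rho}_{11}(e)$ = .02670

Approximate chi-square statistic

$$Q = N \sum_{k=1}^{K} \hat{\rho}_k^2(e), \qquad K = 11 \text{ lags}$$

Test Statistic Q = 34.57047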


Comparing Q with the 5 percent value of a chi-square variable with 43
degrees of freedom, we find $\chi^2_{.05,43} = 59.34$. We conclude that
there is no strong evidence to reject the model.
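The test statistic can be recomputed from the eleven residual autocorrelations listed above. In the sketch below, N = 44 is an assumption based on the statement that one average residual per trial is used over 44 trials; the function name `portmanteau_q` is illustrative, and the critical value 59.34 is the one quoted in the text.

```python
def portmanteau_q(autocorrelations, n_points):
    """Approximate chi-square statistic Q = N * sum of squared autocorrelations."""
    return n_points * sum(r * r for r in autocorrelations)


# Residual autocorrelations at lags 1 through 11, as listed in the text.
residual_autocorrelations = [
    0.02758, -0.38909, -0.02111, 0.38570, -0.34704, -0.38102,
    -0.03358, 0.37201, -0.16558, -0.22597, 0.02670,
]

CHI_SQUARE_CRITICAL = 59.34  # 5 percent value quoted in the text

q = portmanteau_q(residual_autocorrelations, 44)
model_rejected = q > CHI_SQUARE_CRITICAL
print(round(q, 2), model_rejected)
```

Because each autocorrelation enters as a square, the sign of an individual coefficient does not affect Q.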

For the model $Y = 104.595\,t^{-.26492}$, Figure A-19 shows a plot of
the residuals for each observation, and they appear to come from an
approximate "peaked-normal" distribution. Figure A-20 shows a plot
of the estimates of $\sigma^2$ at each trial ($MS_{PE}$), and they tend
to level off after the 16th trial.

The nonlinear models fit to the Test Course data provided the
results shown in Table 4-5 for 674 observations over 44 trials. An
overall plot of the average residuals indicated that there were some
atypical points in the data sample. Atypical points were determined
by background data which indicated that factors extraneous to the test
considerations had exerted undue influence. Additionally, residuals
were judged to be atypical if they were ±3 standard deviations from the
mean of the residuals at a specific trial. A total of 82 observations
were removed from the original aggregate data set. An adjusted data
set was refit after removing atypical points at a specific trial. The
results shown in Table 4-6 indicate that the estimate of the lack of
fit improved slightly for the exponential model $\hat{Y} = ae^{bt}$ while
the fit for the others appeared to get worse, with the exception of
De Jong's model, $Y = a[\beta + (1-\beta)t^{-b}]$. It is also noted that
the lack of fit ratios were twice as large in the adjusted TC data as
compared to the TTC data. It appears that while learning was occurring,
the "noise" or extraneous factors prevent the fitting of a smooth curve
to the data. Those factors can be attributed to circumstances such as
multiple misfires, mechanical or firing system failures, and cases where
ammunition had to be drawn from storage wells. It is noted that a
multi-parameter polynomial model may have fit the data, but it was
intuitive that a learning curve would be a smooth curve rather than the
"zig-zag" curve produced by a polynomial.
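The ±3 standard deviation rule for residuals at a single trial can be expressed compactly. The sketch below covers only that numerical rule, not the judgment based on background data; the function name `flag_atypical` and the sample residuals are hypothetical, and the sample standard deviation (n - 1 divisor) is an assumption since the text does not state which form was used.

```python
import statistics


def flag_atypical(residuals, n_sd=3.0):
    """Return indices of residuals more than n_sd standard deviations from the mean."""
    mean = statistics.mean(residuals)
    sd = statistics.stdev(residuals)  # sample standard deviation (assumed)
    return [i for i, e in enumerate(residuals) if abs(e - mean) > n_sd * sd]


# Hypothetical residuals at one trial: nineteen typical values and one extreme.
residuals = [1.0] * 19 + [100.0]
flagged = flag_atypical(residuals)
print(flagged)
```

With only a handful of observations per trial, a single extreme value inflates the standard deviation enough that few points can exceed three deviations, so the rule is conservative for small groups.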

The  parameter  estimates  for  the  two  test  courses  are  shown 
Table 4-5. Comparative Results for Fitted Models (TC)

                                                                        Lack of
Model                              SSE          SSLOF        SSR          Fit Ratio

$\hat{Y} = at^{-b}$                2468607.8    493855.022   2704131.2    3.7513
$\hat{Y} = ae^{bt}$                2440991.9    466239.122   2731747.1    3.5415
$\hat{Y} = ae^{b/t}$               2514968.3    540215.522   2657770.7    4.103
$\ln\hat{Y} = \ln a - b\ln t$      389.61005    105.07183    16.90632     5.5391
$\ln\hat{Y} = \ln a + bt$          381.08081    96.54264     25.43555     5.0895
$\ln\hat{Y} = \ln a + b/t$         404.16331    119.625      2.35305      6.3063

SSPE = 1974752.77787 (Nonlinear models)
SSPE = 284.53817 (Log transform models)
Table 4-6. Comparative Results for Fitted Models (TC)
(Adjusted Data)

                                                                            Lack of
Model                                SSE          SSLOF          SSR           Fit Ratio

$Y = at^{-b}$                        513771.03    123782.0216    1321367.97    4.141
$Y = ae^{bt}$                        496887.26    106898.2516    1338251.74    3.576
$Y = ae^{b/t}$                       551335.23    161346.222     1283803.77    5.398
$Y = a/(t+b) + c$                    547924.68    157935.6716    1287214.32    5.284
$Y = a[\beta + (1-\beta)t^{-b}]$     506392.74    116943.7316    1328206.26    3.913
$Y = a(\alpha^{t-1}) + \beta$        562010.21    172021.2016    1273128.79    5.755

SSPE = 389989.00838

NOTE: Atypical points removed from data.
below for both the power function and the exponential models.

TTC

$Y = at^{-b}$       a = 104.595     b = .26492
$Y = ae^{bt}$       a = 74.1207     b = -.019076

TC

$Y = at^{-b}$       a = 76.3596     b = .180306
$Y = ae^{bt}$       a = 67.5696     b = -.017967


A comparison indicates that the TTC model parameters are relatively
larger than those for the TC. In addition, the learning factor, which
is represented by the parameter b, appears to be larger for the Test
Training Course.
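The practical effect of the larger learning factor can be seen by evaluating the fitted power curves at a few trials. The sketch below uses the reported estimates and takes b as the positive value in $Y = at^{-b}$, which is an assumption because the sign of b is written both ways in the text. The function names and the choice of trials are illustrative.

```python
import math


def power_curve(a, b, t):
    """Power function learning model Y = a * t**(-b)."""
    return a * t ** (-b)


def exponential_curve(a, b, t):
    """Exponential learning model Y = a * exp(b * t)."""
    return a * math.exp(b * t)


# Reported (a, b) estimates for the power function on each course.
TTC_POWER = (104.595, 0.26492)
TC_POWER = (76.3596, 0.180306)

for t in (1, 11, 22, 44):
    ttc = power_curve(*TTC_POWER, t)
    tc = power_curve(*TC_POWER, t)
    print(t, round(ttc, 1), round(tc, 1))

# Fraction of the first-trial time remaining at trial 44 on each course.
ttc_fraction = power_curve(*TTC_POWER, 44) / power_curve(*TTC_POWER, 1)
tc_fraction = power_curve(*TC_POWER, 44) / power_curve(*TC_POWER, 1)
```

Under these estimates the TTC curve starts higher and falls proportionally faster than the TC curve, which is what a larger b implies.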


Lightweight  Company  Mortar  System 
The 81 mm Gunner's examination was conducted to establish baseline
data to use in comparing the 81 mm mortar with the XM 224E1
Lightweight Company Mortar System. The purpose of the test was to
establish the time it takes to set up and perform a mortar fire
mission and to refamiliarize the test crews with the 81 mm mortar so
that they would be better able to compare it with the XM 224E1. A
summarized description is given below.

1.  Performance  Measure  -  Gunner's  Examination  Scores 
2. Characteristics

(a) Two systems tested

(b) 3 mortar squads

(c) Number of observations: 4 - 81 mm mortar
                            3 - XM 224E1 mortar

(d) Type of activity - Performance Test

Seven complete gunner's examinations were performed during OTI:
four for the 81 mm mortar and three for the XM 224E1/LWCMS. The latter
was not analyzed, even though there appeared to be learning patterns
in the data, because there were only three distinct trials and, since
our learning models contain at least two unknown parameters, further
analysis would be misleading. However, the four trials for the 81 mm
mortar data were analyzed. At each trial or phase, there were six tasks
performed:

(1)  Mounting  the  mortar 

(2)  Small  deflection  and  elevation  change 

(3)  Referring  the  sight 

(4) Large deflection and elevation change

(5) Reciprocal laying

(6)  Manipulation  for  traversing 

A plot of the data is shown in Figure A-21. The background information
indicates that the initial times required to perform the phases of
the gunner's examination were high due to the fact that the test platoon
had not worked with mortars for several weeks and their level of
training was low. Upon completion of the training program, times to
perform the phases of the gunner's examination were minimized. (34)

The plot of the scores over consecutive trials (phases) shows a
discernible pattern which indicates learning. In addition, when the
linear model was fit in step 2, the following results were indicated.


Source        d.f.    Sum of Squares    Mean Square

Regression      1       15732.300        15732.300
Residuals      22       12880.200          585.46364

F-ratio = 15732.30 / 585.46364 = 26.87152


When compared to the F-distribution value for 1 and 22 degrees of
freedom at the 5 percent level, there is evidence to reject the
hypothesis that $\beta_1 = 0$. Additionally, the estimate of the slope
was negative ($\hat{\beta}_1 = -22.9$) and the confidence interval
around $\beta_1$ did not include zero; therefore the sample data was
concluded to be suitable for further analysis.

The nonlinear model $Y = at^{-b}$ was fit and the results are shown
below.
$Y = 115.139\,t^{-.5837}$

Source           d.f.    Sum of Squares    Mean Square

Regression         2       154831.0
Residuals         22        13549.849
(Lack of Fit)      2         2054.849       1027.425
(Pure Error)      20        11495.0          574.75

Lack of Fit ratio = 1027.425 / 574.75 = 1.788


Team  Training 

An air traffic control task was used in which each of two teammates
portrayed a "pattern feeder" whose responsibility it was to guide
aircraft into an approach gate by issuing verbal instructions via a
simulated radio linked to the aircraft pilots. Two variables were
manipulated in Experiment VIII: work load (or time stress) and team
arrangement. Stress is defined in terms of the required approach rate
(system criterion): one approach every 2 minutes for low stress, and
one every minute for high stress. Team arrangement was defined in
terms of the manner in which the two teammates coordinated in order to
satisfy the system criterion. The two team arrangements used were termed
reciprocal and nonreciprocal. In the nonreciprocal arrangement the
team was instructed to satisfy the low-stress criterion on each approach,
independently of any time error incurred on previous approaches. In
the reciprocal arrangement, on the other hand, each radar controller
(RC) was instructed to compensate for any time error which may have
accrued over the previous approaches. A summarized description is
presented below.

1.  Performance  Measure  -  Flight  Errors  by  all  groups  of 

Experiment  VIII 

2.  Characteristics 

(a)  4  groups 

(b)  Two  groups  used  reciprocal  arrangement  under  both 
high  and  low  stress  conditions 

(c)  Two  groups  used  nonreciprocal  arrangement  under  both 
high  and  low  stress  conditions 

(d)  Four  sessions  (trials)  for  each  group 

A plot of data from Experiment VIII of the test report shows the
performance measure, mean number of flight errors, vs sessions
(consecutive trials). The graph shows patterns which appear to indicate
learning (see Figure A-22). The linear model was fit in step 2 of the
iterative analysis process with the following results.


Source        d.f.    Sum of Squares    Mean Square

Regression      1        784.37812        784.37812
Residuals      14        689.48125         49.24866

F-ratio = 784.37812 / 49.24866 = 15.92689
When compared to the F-distribution value for 1 and 14 degrees of
freedom at the 5 percent level, there is evidence to reject the
hypothesis that $\beta_1 = 0$. Additionally, the estimate of the slope
was negative ($\hat{\beta}_1 = -6.2625$) and the confidence interval
around $\beta_1$ did not include zero; therefore the sample data was
concluded to be suitable for further analysis.

The nonlinear model $Y = at^{-b}$ was fit and the results are shown
below.

$Y = 25.3582\,t^{-1.00391}$

Source           d.f.    Sum of Squares    Mean Square

Regression         2        3954.11
Residuals         14         589.140
(Lack of Fit)      2           1.6875         0.72625
(Pure Error)      12         587.6875        48.974
TOTAL             16        4243.25

Lack of Fit ratio = 0.72625 / 48.974 = .01483
CHAPTER V

CONCLUSIONS  AND  RECOMMENDATIONS 

Conclusions 

This  research  has  addressed  the  problem  of  determining  the  existence 
of  a  representative  group/crew  learning  curve  (or  set  of  curves)  and  the 
development  of  a  mathematical  description  of  this  curve  applicable  to 
training levels in operational testing. Data from OTEA test reports and
data made available through other training and training analysis agencies
were analyzed using an iterative procedure to determine whether learning
patterns could be detected.

A  screening  process  was  used  to  determine  the  suitability  of  data  for 
further  analysis,  after  which  learning  models  suggested  in  the  literature 
were  fit  to  the  screened  data  using  nonlinear  regression  techniques.  A 
comparison  of  the  fitted  models  was  conducted  by  comparing  the  Lack  of 
Fit  ratios  and  the  sum  of  squares  for  regression  computed  for  each  model. 

This  comparison  shows  that  the  following  models  appear  to  provide 
an  adequate  fit  to  the  data  analyzed. 

(1) $Y = at^{-b}$    The power function

(2) $Y = a[\beta + (1-\beta)t^{-b}]$    De Jong's model

(3) $Y = at^{-b} + c$

(4) $Y = ae^{bt}$

Since the variations of the power function, models (2) and (3), did not
appear to provide a better fit to the data, model (1) was selected from
the standpoint of parsimony (fewest parameters). In addition, it cannot
be stated conclusively that model (1) provides a better fit than model
(4). However, based on a survey of the industrial applications of the
power function model as reported in the literature, it was concluded that
the model $\hat{Y} = at^{-b}$ does adequately fit the empirical data analyzed
and can be used as a representative group/crew learning model for these data.
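
The parsimony argument can be made concrete by writing the four candidate models as functions. In the Python sketch below the function names and the illustrative values a = 25 and b = 1 are assumptions introduced for this example. It shows that models (2) and (3) contain model (1) as a special case, so each spends an extra parameter that the data analyzed did not appear to require.

```python
import math


def power_model(t, a, b):
    """Model (1): Y = a * t**(-b)."""
    return a * t ** (-b)


def de_jong_model(t, a, b, beta):
    """Model (2): Y = a * (beta + (1 - beta) * t**(-b))."""
    return a * (beta + (1.0 - beta) * t ** (-b))


def power_plus_constant_model(t, a, b, c):
    """Model (3): Y = a * t**(-b) + c."""
    return a * t ** (-b) + c


def exponential_model(t, a, b):
    """Model (4): Y = a * exp(b * t); b is negative when performance improves."""
    return a * math.exp(b * t)


# Number of fitted parameters in each model.
PARAMETER_COUNTS = {1: 2, 2: 3, 3: 3, 4: 2}

# Hypothetical parameter values, used only to show the nesting.
a, b = 25.0, 1.0
trials = (1, 2, 3, 4)

# With beta = 0 (model 2) or c = 0 (model 3) the curve is exactly model (1).
nested = all(
    math.isclose(de_jong_model(t, a, b, 0.0), power_model(t, a, b))
    and math.isclose(power_plus_constant_model(t, a, b, 0.0), power_model(t, a, b))
    for t in trials
)
print(nested)
```

Model (4) has the same number of parameters as model (1) but is not nested in it, which is why the choice between those two rests on other grounds.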

Limitations  of  the  Research 

This  research  has  been  limited  by  the  availability  of  adequate 
data  representing  several  different  crew  and  group  learning  situations. 
The  lack  of  a  larger  data  base  limited  the  analysis  to  a  small  number  of 
performance  measures.  These  included  tracking,  tank  gunnery  and  mortar 
examination  scores.  Since  the  analysis  of  a  large  number  of  data  sets 
involving  a  variety  of  crew  tasks  and  performance  measures  was  not 
possible,  this  study  concentrated  on  the  analysis  of  suggested  learning 
models  for  the  limited  data  available. 

Considerations  for  Test  Design 

Even though there is a limited amount of data available in the
group/team context as discussed previously, future data may be analyzed
using the iterative procedures developed in Chapter III. However, a
review of the literature indicates that the following considerations
should be made when providing input for the design of operational tests.

1. Ensure that individual skill competencies are acquired prior
to engaging in team training or testing. A consistent finding
is that individual proficiency is a significant factor in
determining team performance (24).

2. Address the problem involved in the production of standardized,
replicable test conditions and the establishment of
accepted group/team performance criteria by defining the
task characteristics needed to identify realistic training
objectives (24). These particular aspects are not clearly
defined in current literature, but objectives are outlined in
these references (24, 47, 48, 49, 50).

3.  Distinguish  between  organizational  type  tasks  and  mission 
type  tasks. 

4.  The  detection,  measurement,  and  recording  of  the  value  of  an 
observable  event  at  each  occurrence  (24).  Current  tests 
which  use  blocking  and  randomized  test  design  should  provide 
a  vehicle  for  recording  these  consecutive  occurrences  in 
addition  to  recording  the  cell  totals. 

5. Assessment of learning effects. Procedures developed by
Yealy (51) could be used to determine the rate of learning at a
specific trial during an operational test. These procedures
could be employed in two ways. (a) Conduct the initial stages
of the test in a sequential fashion, say, for the first three
trials, to determine the rate of learning, if any. If the rate of
learning has leveled off, then the participants are assumed to be
at or approaching a fully learned state, and the test could
continue with learning effects considered negligible. On the other
hand, if the rate of learning has not leveled off, then the
test should be continued in a sequential fashion until learning
effects become negligible. However, this approach appears to
be too costly in terms of manpower and resources. (b) An
alternative approach would be to conduct a pretest and
determine the rate of learning at each trial. When a satisfactory
level of learning is reached, the operational test could
begin.

6. Avoid where possible the inclusion of order effects in the
test design in which the participants, for example, learn
where to look (learning the problem) rather than learning
how to operate the equipment being evaluated.
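
The idea in consideration 5 of judging when the rate of learning has leveled off can be illustrated with the power function. The Python sketch below is not Yealy's procedure (51), which is not reproduced here. It simply assumes that a power function has already been fitted, treats the trial number as continuous, and uses the derivative $dY/dt = -ab\,t^{-(b+1)}$ as the rate of learning. The function names, the tolerance of one error per trial, and the reuse of the Experiment VIII estimates are assumptions made for illustration.

```python
def learning_rate(t, a, b):
    """Slope of Y = a * t**(-b) at trial t: dY/dt = -a * b * t**(-(b + 1))."""
    return -a * b * t ** (-(b + 1.0))


def first_negligible_trial(a, b, tolerance, max_trial=1000):
    """First trial at which the absolute rate of learning drops below tolerance.

    Returns None if the rate never drops below tolerance within max_trial.
    """
    for trial in range(1, max_trial + 1):
        if abs(learning_rate(trial, a, b)) < tolerance:
            return trial
    return None


# Estimates from Experiment VIII, reused here purely as an example.
A_HAT, B_HAT = 25.3582, 1.00391

# Hypothetical tolerance: less than one flight error gained per extra trial.
trial_ready = first_negligible_trial(A_HAT, B_HAT, tolerance=1.0)
print(trial_ready)
```

Under approach (b), a pretest would run until a trial of this kind is reached, after which the operational test could begin with learning effects treated as negligible.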

Recommendations 

The following recommendations for future research are made as a
result of this study. One recommendation is the acquisition and analysis
of more data using procedures outlined in Chapter III. Since this study
was limited by the nonavailability of a large number of adequate data
sets, further analysis of other sample data could be used to verify
results obtained in the study. This would include the study of the
adequacy of the power function, $\hat{Y} = at^{-b}$, vs the exponential
model, $\hat{Y} = ae^{bt}$, since both models appeared to fit sample data
analyzed in this study. However, it could not be determined that the two
models were statistically different.

Another recommendation involves the development of group/crew
learning curves (or sets of curves) for specific crews or units, e.g.,
an artillery battery or a rifle squad. Models should be developed at the
basic research level to consider the interaction among crew members and
a possible comparison of the performance by individuals and by the crew.
This should be done because it appears that there is no single overall
true model for all group learning. Since military teams or units are
structured differently and have inherent mission capabilities, the
concept of an overall true model would not adequately reflect
these differences.


APPENDIX  A 


This appendix contains representative
plots used in the analysis of sample
data in Chapter IV.


Figure A-4. Gun Crew (3-13) Data with Regression Line and Confidence Interval
(axes: root mean square error vs consecutive test trial)

Figure A-5. Plot of Data for ITV Gun Crew (0-17)
(horizontal axis: trial number)

Figure A-7. Plot of Data for Dragon Gun Crew 0303(T2)

Figure A-9. Plot of Data for Dragon Gun Crew 2313(T4)

Figure A-19. Histogram of Residuals for Adjusted Sample Data (TTC)


APPENDIX  B 


This  appendix  contains  a  FORTRAN  IV  listing 
of  the  program  used  to  provide  parameter 
estimates  used  in  SPSS  subprogram  Nonlinear. 
To  execute  program,  the  user  must  provide 
the  number  of  observations,  starting  values 
for  parameters,  actual  observations,  and 
trial  numbers  for  each  observation. 


      PROGRAM PARAMS (INPUT, OUTPUT, TAPE5=INPUT, TAPE6=OUTPUT)
      DIMENSION OBS(700), TIME(700)
      READ*, N, A, B, (OBS(I), I=1,N), (TIME(I), I=1,N)
C
C     THIS PROGRAM SOLVES FOR PARAMETERS "A", "B" BY
C     MINIMIZING THE SUM OF SQUARED ERRORS USING A
C     GRADIENT TYPE SEARCH PROCEDURE.
C
      DO 11 K=1,100
      Q1=Q2=0.0
C     RESET THE ACCUMULATED CROSS PRODUCTS FOR THIS ITERATION.
      F11=F12=F21=F22=0.0
      DO 12 I=1,N
C     F1 AND F2 ARE THE PARTIAL DERIVATIVES OF A/(TIME**B)
C     WITH RESPECT TO "A" AND "B".
      F1=(0.0+(1.0/(TIME(I)**B)))
      F2=(0.0+(A/(TIME(I)**B))*ALOG(1.0/TIME(I)))
      F11=F11+F1*F1
      F12=F12+F1*F2
      F21=F21+F2*F1
      F22=F22+F2*F2
      Q1=Q1+(OBS(I)-(A/(TIME(I)**B)))*F1
      Q2=Q2+(OBS(I)-(A/(TIME(I)**B)))*F2
   12 CONTINUE
C
C     SOLVE FOR ELEMENTS OF DIRECTION VECTOR THAT WILL
C     IMPROVE OUR ESTIMATES OF PARAMETERS "A" AND "B".
C     FIND "D1" AND "D2" BY SOLVING A 2X3 MATRIX.
C
      F111=1.0
      F121=F12/F11
      Q11=Q1/F11
      F211=1.0
      F221=F22/F21
      Q21=Q2/F21
C
C     CONDUCT MATRIX ADDITION TO OBTAIN ZERO COEFFICIENT
C     FOR D1 IN SECOND EQUATION.
C
      F112=F111
      F122=F121
      Q12=Q11
      F212=F211-F111
      F222=F221-F121
      Q22=Q21-Q11
C
C     GET COEFFICIENTS OF D2 IN BOTH EQUATIONS AT
C     SAME VALUE
C
      F113=F112*(F222/F122)
      F123=F122*(F222/F122)
      Q13=Q12*(F222/F122)
      F213=F212
      F223=F222
      Q23=Q22
C
C     CONDUCT MATRIX ADDITION TO OBTAIN ZERO COEFFICIENT
C     FOR D2 IN FIRST EQUATION
C
      F114=F113
      F124=F123-F223
      Q14=Q13-Q23
      F214=F213
      F224=F223
      Q24=Q23
C
C     PUT IN STANDARD FORM WHERE COEFFICIENT OF D1 IN
C     EQUATION 1 EQUALS 1 AND COEFFICIENT OF D2 IN
C     EQUATION 2 EQUALS 1 AND FIND THE VALUES FOR D1
C     AND D2 RESPECTIVELY
C
      F115=F114*(F122/F222)
      F125=F124
      Q15=Q14*(F122/F222)
      F215=F214
      F225=1.0
      Q25=Q24/F224
      H=0.0-1.0
      IF (F115 .GT. H) GO TO 13
      F115=F115*H
      Q15=Q15*H
   13 IF (F225 .GT. H) GO TO 14
      F225=F225*H
      Q25=Q25*H
   14 D1=Q15
      D2=Q25
C
C     FIND MAXIMUM DISTANCE, VMIN, TO PROCEED IN NEW
C     DIRECTION FROM CURRENT PARAMETER VECTOR TO GAIN
C     AN IMPROVEMENT IN MINIMIZING SUM OF SQUARED ERRORS
C
      W=1.0
   21 A1=A
      B1=B
      A2=A+(W*.5)*D1
      B2=B+(W*.5)*D2
      A3=A+W*D1
      B3=B+W*D2
      QA1=QA2=QA3=0.0
      DO 15 I=1,N
      QA1=QA1+(OBS(I)-(A1/(TIME(I)**B1)))**2
      QA2=QA2+(OBS(I)-(A2/(TIME(I)**B2)))**2
      QA3=QA3+(OBS(I)-(A3/(TIME(I)**B3)))**2
   15 CONTINUE
      VAL1=QA1+QA3
      VAL2=2.0*QA2
      IF (VAL1 .EQ. VAL2) GO TO 18
      VMIN=0.5+.25*(QA1-QA3)/(QA3-2.0*QA2+QA1)
   18 AV=A+VMIN*D1
      BV=B+VMIN*D2
      QV=0.0
      DO 19 I=1,N
      QV=QV+(OBS(I)-(AV/(TIME(I)**BV)))**2
   19 CONTINUE
      VAL=QV-QA1
      IF (VAL .LT. .00001) GO TO 20
      W=W*.5
      WRITE(6,97) QV
   97 FORMAT ("  "/"QV = ",F14.8)
      GO TO 21
C     STOP WHEN BOTH ELEMENTS OF THE DIRECTION VECTOR ARE NEGLIGIBLE.
   20 D11=D1
      D22=D2
      IF (D11 .GT. 0.0) GO TO 31
      D11=(0.0-1.0)*D11
   31 IF (D11 .GT. .000001) GO TO 32
      IF (D22 .GT. 0.0) GO TO 33
      D22=(0.0-1.0)*D22
   33 IF (D22 .LT. .000001) GO TO 16
   32 A=A+VMIN*D1
      B=B+VMIN*D2
   11 CONTINUE
   16 SE=(QA1/(N-2))**0.5
      WRITE (6,17) A,B,SE
   17 FORMAT("  "/"PARMA = ",F11.8,5X,"PARMB= ",
     CF11.8,5X,"STD DEV= ",F11.8)
      STOP
      END
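
For readers without access to a FORTRAN IV compiler, the search can be sketched in Python. This is a sketch of the same general idea and not a translation of the listing. It solves the same 2 x 2 normal equations for the direction vector, but it replaces the quadratic interpolation for VMIN with plain step halving, and it solves the equations by Cramer's rule rather than by row operations. The function names, the iteration limits, and the synthetic data generated from a = 25 and b = 1 are assumptions made for illustration.

```python
import math


def sum_squared_errors(obs, trials, a, b):
    """Sum of squared errors for the model Y = a * t**(-b)."""
    return sum((y - a * t ** (-b)) ** 2 for y, t in zip(obs, trials))


def fit_power_model(obs, trials, a, b, max_iter=100, tol=1e-6):
    """Gauss-Newton search with step halving; returns (a, b, std dev)."""
    for _ in range(max_iter):
        f11 = f12 = f22 = q1 = q2 = 0.0
        for y, t in zip(obs, trials):
            pred = a * t ** (-b)
            d_a = t ** (-b)             # partial derivative with respect to a
            d_b = -pred * math.log(t)   # partial derivative with respect to b
            resid = y - pred
            f11 += d_a * d_a
            f12 += d_a * d_b
            f22 += d_b * d_b
            q1 += resid * d_a
            q2 += resid * d_b
        det = f11 * f22 - f12 * f12
        if det == 0.0:
            break                       # normal equations are singular
        d1 = (q1 * f22 - q2 * f12) / det
        d2 = (q2 * f11 - q1 * f12) / det
        if abs(d1) < tol and abs(d2) < tol:
            break                       # direction vector is negligible
        current = sum_squared_errors(obs, trials, a, b)
        step, improved = 1.0, False
        for _ in range(30):             # halve the step until the fit improves
            trial_sse = sum_squared_errors(obs, trials, a + step * d1, b + step * d2)
            if trial_sse < current:
                improved = True
                break
            step *= 0.5
        if not improved:
            break
        a, b = a + step * d1, b + step * d2
    std_dev = math.sqrt(sum_squared_errors(obs, trials, a, b) / (len(obs) - 2))
    return a, b, std_dev


# Synthetic, noise-free observations from Y = 25 / t, two per trial.
trials = [1, 2, 3, 4, 1, 2, 3, 4]
obs = [25.0 / t for t in trials]
a_hat, b_hat, std_dev = fit_power_model(obs, trials, a=20.0, b=0.8)
print(round(a_hat, 4), round(b_hat, 4))
```

As with the original program, the caller supplies the observations, the trial number of each observation, and starting values for both parameters. All trial numbers must be positive because the derivative with respect to b uses the logarithm of the trial number.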
APPENDIX C


This  appendix  contains  an  execution 
run  for  the  Lightweight  Company 
Mortar  System  sample  data  using  the 
SPSS  Nonlinear  subprogram. 


BLAST  78/05/10 
APPENDIX D


This appendix contains plots
of the final fitted models
selected in Chapter IV.